Monday, July 4, 2016

Fedora 23 boot time optimization

Before I start, if you are only interested in the solutions to my boot problems, skip directly to the "Let's get started" section further below.  For those of you interested in the background story, read on:

This past weekend I decided to fix a problem that has been bothering me for a while.  I have a perfectly good Intel based PC that I use at home for various tasks.

It's an older machine dating from 2006, running on an Intel Core 2 Duo E6600 with 4 gigabytes of RAM.  Of course there are no SSDs; it's running on 5600rpm SATA drives.  Despite its low hardware specs, it should still be fairly useful as long as I'm not trying to run the latest games on it.  Besides, I mostly use it to code in C++ and work on small projects.

I recently installed Fedora 23 from a live DVD, in order to have a server that I can access remotely.  Note that this was an "out-of-the-box" installation with no special requirements.

Immediately I noticed that the boot-time was relatively slow.  I didn't measure it at that time, but it was probably around 1.5 minutes or more.  Since I wasn't planning on using the system much, I didn't care to look into it either.  My plan was to boot it up in the morning, use it remotely and turn it off at night... boot time was not an issue.

Recently however, I found myself accessing it more often and the boot time started to become cumbersome.  Following some research online I found a few tools which anyone serious about fixing slow boot-times should get familiar with:

systemd-analyze
systemd-analyze blame
systemd-analyze critical-chain
systemd-analyze plot > somefile.svg
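
To compare boots later, it helps to save the output of all four commands after each boot.  A minimal sketch, assuming a POSIX shell; the function name snapshot_boot and the output file names are made up for illustration:

```shell
#!/bin/sh
# snapshot_boot DIR: save the output of the four commands above into DIR
# so that successive boots can be compared later.
snapshot_boot() {
  dir=$1
  mkdir -p "$dir" || return 1
  systemd-analyze                > "$dir/time.txt"
  systemd-analyze blame          > "$dir/blame.txt"
  systemd-analyze critical-chain > "$dir/critical-chain.txt"
  systemd-analyze plot           > "$dir/plot.svg"
}

# Example (hypothetical directory name):
#   snapshot_boot "$HOME/boot-logs/$(date +%Y%m%d-%H%M%S)"
```

Running it once after each tweak gives one directory per boot, and the text files can then be compared with diff.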

I used all four of these to gather the data I needed to determine what was causing my booting woes and, believe it or not, I went from over 3 minutes to less than 25 seconds.

Here's a summary of the various systemd-analyze boot times as I experimented with the tweaks below:

[removed@removed ~]$ cat ./boot-time
Startup finished in 904ms (kernel) + 4.155s (initrd) + 2min 59.009s (userspace) = 3min 4.069s
Startup finished in 906ms (kernel) + 4.096s (initrd) + 1min 439ms (userspace) = 1min 5.441s
Startup finished in 905ms (kernel) + 4.089s (initrd) + 1min 1.978s (userspace) = 1min 6.973s
Startup finished in 904ms (kernel) + 4.162s (initrd) + 41.080s (userspace) = 46.147s
Startup finished in 904ms (kernel) + 4.148s (initrd) + 40.696s (userspace) = 45.749s
Startup finished in 905ms (kernel) + 4.209s (initrd) + 35.845s (userspace) = 40.959s
Startup finished in 905ms (kernel) + 4.183s (initrd) + 1min 158ms (userspace) = 1min 5.246s
Startup finished in 905ms (kernel) + 4.186s (initrd) + 34.855s (userspace) = 39.947s
Startup finished in 905ms (kernel) + 4.278s (initrd) + 30.879s (userspace) = 36.063s
Startup finished in 906ms (kernel) + 4.047s (initrd) + 31.017s (userspace) = 35.971s
Startup finished in 905ms (kernel) + 4.161s (initrd) + 30.502s (userspace) = 35.569s
Startup finished in 905ms (kernel) + 4.070s (initrd) + 29.762s (userspace) = 34.739s
Startup finished in 906ms (kernel) + 4.198s (initrd) + 19.584s (userspace) = 24.688s
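
Totals that mix minutes, seconds and milliseconds are awkward to compare at a glance.  As a small sketch (the function name total_seconds is hypothetical), awk can convert the value after the "=" sign into plain seconds:

```shell
#!/bin/sh
# total_seconds: reads "Startup finished ..." lines on stdin and prints the
# total after the "=" sign as plain seconds, one value per line.
total_seconds() {
  awk -F' = ' '/^Startup finished/ {
    n = split($2, part, " ")
    secs = 0
    for (i = 1; i <= n; i++) {
      # Each token is a number with a unit suffix: min, ms or s.
      if (part[i] ~ /min$/)     secs += 60 * part[i]
      else if (part[i] ~ /ms$/) secs += part[i] / 1000
      else if (part[i] ~ /s$/)  secs += part[i]
    }
    printf "%.3f\n", secs
  }'
}

# Two of the lines from the summary above:
total_seconds <<'EOF'
Startup finished in 904ms (kernel) + 4.155s (initrd) + 2min 59.009s (userspace) = 3min 4.069s
Startup finished in 906ms (kernel) + 4.198s (initrd) + 19.584s (userspace) = 24.688s
EOF
```

With the summary file shown above, total_seconds < ./boot-time prints one number per recorded boot.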

To be fair, I had added configuration to my system which only served to slow down boot time: SMB and NMB.  I wanted to share some files with my Windows computers, but I have now decided to disable it.

I'm not going to go through all of the changes that I made, the research I did and the reasons I chose to do certain things... a lot of it was hit-and-miss and getting data from all over the place.  I found that the Arch-Linux documentation is, as always, extremely helpful.  Here's an example:

https://wiki.archlinux.org/index.php/systemd#Journal_size_limit

I created a file containing my changes and the perceived effect on boot time and I will attempt to describe what I did from beginning to the end.

Let's get started:

First boot time trace using systemd-analyze >> ./boot-time.txt

Startup finished in 904ms (kernel) + 4.155s (initrd) + 2min 59.009s (userspace) = 3min 4.069s

My first attempt at reducing boot time was to check systemd-analyze blame, which showed that firewalld seemed to be a bit of a bottleneck.  Replacing it with iptables cut the measured boot time by close to two minutes (from 3min 4s to 1min 5s).  Why does firewalld take so long to start, and why does it block the boot process?  I don't know; I haven't researched it yet.

Switch from Firewalld to IPTables (considerable difference)

# Removed firewalld - replaced with iptables
# sudo systemctl stop firewalld
# sudo systemctl disable firewalld
# sudo dnf install iptables-services
# sudo systemctl enable iptables.service
# sudo systemctl start iptables.service

Startup finished in 906ms (kernel) + 4.096s (initrd) + 1min 439ms (userspace) = 1min 5.441s

Disable plymouth-quit-wait.service (considerable difference)

Another bottleneck was plymouth-quit-wait.service.  This was an obvious one and many people recommend disabling it, but I'm not sure it should be; I still have to research this one further.  I have a feeling I would prefer to disable plymouth entirely.  Still, the difference on boot is considerable, with a gain of over 20 seconds.  Note that you have to both disable and mask the service for the gain to take effect.

# sudo systemctl disable plymouth-quit-wait.service
# sudo systemctl mask plymouth-quit-wait.service - BIG DIFFERENCE
Startup finished in 904ms (kernel) + 4.162s (initrd) + 41.080s (userspace) = 46.147s

Readahead on boot? (negligible or negative)

I tried installing preload to have some readahead capabilities, but this only helps applications after boot, and has a small negative effect on boot time, so I ended up removing it.

# sudo dnf install preload
# sudo systemctl enable preload.service
Startup finished in 904ms (kernel) + 4.148s (initrd) + 40.696s (userspace) = 45.749s


Note that the RedHat team decided to completely remove systemd-readahead, whose sole purpose was to improve boot speed.  More on this later.

Things are now still relatively slow and inconsistent:

Startup finished in 905ms (kernel) + 4.183s (initrd) + 1min 158ms (userspace) = 1min 5.246s

Disable Samba and NetBIOS (considerable difference)

The next step was to disable SMB and NMB which also had a considerable impact:

# DISABLED SAMBA and NETBIOS - BIG DIFFERENCE
# sudo systemctl stop smb.service
# sudo systemctl stop nmb.service
# sudo systemctl disable smb.service
# sudo systemctl disable nmb.service

Startup finished in 905ms (kernel) + 4.186s (initrd) + 34.855s (userspace) = 39.947s

DISABLE Libvirtd (minor)

I noticed a few services starting up that I didn't think I needed, so I researched them and disabled them.  I use VirtualBox and have the VirtualBox kernel driver installed, so I have no need for libvirtd or libvirt.  I saved 4 seconds but again, these differences are not always consistent.

# DISABLED LIBVIRTD- NOT SIGNIFICANT
# sudo systemctl disable libvirtd.service
Startup finished in 905ms (kernel) + 4.278s (initrd) + 30.879s (userspace) = 36.063s

DISABLE ModemManager (minor)

ModemManager seemed to take a long time to start up, but it was not a blocking service, so disabling it did not improve boot time significantly.  Still, if I don't need it, it's wasting precious time.

# DISABLED MODEMMANAGER - NOT SIGNIFICANT
Startup finished in 905ms (kernel) + 4.161s (initrd) + 30.502s (userspace) = 35.569s

DISABLE rngd (minor)

The random number generator daemon (rngd) caused errors in the logs since I have no hardware-based seed.  I have no idea why this has to be loaded by default.

# DISABLED rngd.service - NOT SIGNIFICANT
Startup finished in 905ms (kernel) + 4.070s (initrd) + 29.762s (userspace) = 34.739s
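
The notes above don't record the exact commands for ModemManager and rngd.  Presumably they follow the same stop/disable pattern as the other services.  The sketch below assumes the unit names ModemManager.service and rngd.service (check with systemctl list-unit-files) and uses a hypothetical helper, disable_units, with a dry-run switch so that nothing is changed by accident:

```shell
#!/bin/sh
# disable_units UNIT...: stop and disable each unit.
# With DRY_RUN=1 the commands are only printed, not executed.
disable_units() {
  for unit in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "sudo systemctl stop $unit"
      echo "sudo systemctl disable $unit"
    else
      sudo systemctl stop "$unit"
      sudo systemctl disable "$unit"
    fi
  done
}

# Unit names are assumptions; verify them before running without DRY_RUN.
DRY_RUN=1 disable_units ModemManager.service rngd.service
```

Dropping DRY_RUN=1 runs the same commands for real.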

Limit the systemd journal size (considerable)

Now another big one was the systemd-journal.service.  After several boot-ups, this one took approximately 13 seconds to load on average and it was blocking other services.

Apparently, the journal system is slow because on boot it reads entries from one set of files and rewrites them to another set of files.  It does not copy the files directly, but goes into each file, reads the entries as objects and dumps them into another file.  The bigger your logs, the slower this will be.  The information I recite here is from an old bug report, and things may have changed since; I'm not sure.

My home PC had a /var/log/journal directory of 750 MB and it took 13 seconds almost consistently.

On the other hand, my work PC runs Fedora 23 (same basic install) on a 2012 Dell with an Intel i7, 16 GB of RAM and 7200rpm drives.  My journal log was 1.58 GB and the systemd-journal.service started in 4 seconds... That's a big difference and it all has to do with hardware.

There is a way to solve this problem for slower computers and the difference is a considerable 10 second improvement:

# SET MAX JOURNAL SIZE: https://wiki.archlinux.org/index.php/systemd#Journal_size_limit    -- BIG DIFFERENCE
Startup finished in 906ms (kernel) + 4.198s (initrd) + 19.584s (userspace) = 24.688s
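
The linked Arch wiki section sets the limit with the SystemMaxUse key in the [Journal] section of journald's configuration.  As a hedged sketch: the helper name write_journal_limit is hypothetical, 50M is only the illustrative value from the wiki (not necessarily the one used for the measurement above), and the drop-in directory assumes a systemd version that reads /etc/systemd/journald.conf.d.  If yours does not, set the same key in /etc/systemd/journald.conf instead.

```shell
#!/bin/sh
# write_journal_limit SIZE [DIR]: write a journald drop-in that caps the
# on-disk journal at SIZE. DIR defaults to the system drop-in directory.
write_journal_limit() {
  size=$1
  dir=${2:-/etc/systemd/journald.conf.d}
  mkdir -p "$dir" || return 1
  printf '[Journal]\nSystemMaxUse=%s\n' "$size" > "$dir/00-journal-size.conf"
}

# As root (illustrative value):
#   write_journal_limit 50M
#   systemctl restart systemd-journald
#   journalctl --vacuum-size=50M   # trim journal files that already exist
```

The new limit is applied when journald restarts, and the vacuum step shrinks the journal files already on disk.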

... 24.688 seconds ...

Conclusion:

Apart from SMB and NMB, which I added and later removed myself, the bulk of these are relatively minor tweaks which should come out-of-the-box.  While I agree that it's nice to have OSes able to use powerful hardware, I find it's a waste to load services if they are not needed.  That being said, while I find that most RedHat folks are great and helpful, there are some points of view that I disagree with wholeheartedly:  https://lists.freedesktop.org/archives/systemd-devel/2014-August/022002.html

This is just an example, but I find it hard to believe that support would be dropped for boot-time readahead just because "Nobody in the systemd team still works on a laptop with rotating media, hence nobody tries to optimize it in any way."  - that is a very poor excuse for not optimizing systems.

I agree that read-ahead does become cumbersome on SSD drives, but then at least provide it as an option during the installation of the OS.  Why should the multitudes of rotating-media systems suffer simply because the RedHat team all use SSDs?

At the end of the day I start to ask myself, why am I running Fedora again?

Thursday, January 14, 2016

CentOS 7 - No network device detected

While installing CentOS 7 on older hardware, I found a solution to the problem where the network device is not automatically detected during or after the system install.

After some searching on Google, I found the solution to this problem in the following forum post:  http://unix.stackexchange.com/a/200029

"This device uses the forcedeth driver which is disabled in the CentOS 7 kernel.
You can use the kmod-forcedeth driver from elrepo.org:
http://elrepo.org/linux/elrepo/el7/x86_64/RPMS/kmod-forcedeth-0.64-1.el7.elrepo.x86_64.rpm"

As the answer states, it is simply a matter of installing the RPM and rebooting the system.  To do this, I downloaded the file to a USB stick and ran the yum localinstall command as such:

# yum localinstall /run/media/.../kmod-forcedeth-0.64-1.el7.elrepo.x86_64.rpm


Thursday, December 10, 2015

Gray font, web accessibility and rude comments.

A few days ago I received an extremely rude comment.  It was alluding to the font color that I selected to display code snippets on this blog.  While trying to understand how I had offended the poster, I realized that under some circumstances low contrast fonts can be extremely difficult to read.  Interestingly, apart from the rant, the only clear indication of the problem in the poster's comment was the following three words: "... using gray font!!!"

I initially chose a dark gray font as a way to differentiate between this blog's text and code snippets, or text as part of a file.  I did a bit of research after reading his/her post, and found that low contrast fonts are not conducive to creating an accessible or easily readable website.  NOTE: I did check the hexadecimal codes of the background and foreground against a WCAG 2.0 colour testing site, and found the following:

My code foreground colour was gray #666666
The background colour is white #FFFFFF

The test was performed at:
http://webaim.org/resources/contrastchecker/

The results from the test are as shown in the following screenshot:

contrast test result: Normal Text WCAG AA: Pass, WCAG AAA: Fail | Large Text WCAG AA: Pass, WCAG AAA: Pass
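
For reference, the number behind that result can be reproduced by hand.  WCAG 2.0 defines the contrast ratio as (L1 + 0.05) / (L2 + 0.05), where L1 and L2 are the relative luminances of the lighter and darker colour.  Normal text needs at least 4.5:1 for AA and 7:1 for AAA; large text needs 3:1 and 4.5:1.  The sketch below (the function name contrast_ratio is made up for illustration) computes the ratio for the two colours listed above using only sh and awk:

```shell
#!/bin/sh
# contrast_ratio FG BG: print the WCAG 2.0 contrast ratio of two #RRGGBB colours.
contrast_ratio() {
  awk -v fg="$1" -v bg="$2" '
    # Convert a two-digit hex string to a number (0-255).
    function hex2(s,   hi, lo) {
      s = tolower(s)
      hi = index("0123456789abcdef", substr(s, 1, 1)) - 1
      lo = index("0123456789abcdef", substr(s, 2, 1)) - 1
      return hi * 16 + lo
    }
    # Linearise one sRGB channel as defined by WCAG 2.0.
    function lin(c) {
      c /= 255
      return (c <= 0.03928) ? c / 12.92 : ((c + 0.055) / 1.055) ^ 2.4
    }
    # Relative luminance of a "#RRGGBB" colour.
    function lum(col,   r, g, b) {
      sub(/^#/, "", col)
      r = lin(hex2(substr(col, 1, 2)))
      g = lin(hex2(substr(col, 3, 2)))
      b = lin(hex2(substr(col, 5, 2)))
      return 0.2126 * r + 0.7152 * g + 0.0722 * b
    }
    BEGIN {
      a = lum(fg); b = lum(bg)
      if (a < b) { t = a; a = b; b = t }   # a is the lighter colour
      printf "%.2f\n", (a + 0.05) / (b + 0.05)
    }'
}

contrast_ratio "#666666" "#FFFFFF"   # the colours tested above
```

The ratio for gray on white falls between 4.5 and 7, which is consistent with the AA pass and AAA fail reported for normal text.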



Armed with this new knowledge, I will strive to write future posts in a more accessible format, while retaining the differentiation between plain text and code or configuration files.  If I have the time, I will endeavor to update my previous posts to this format, though I will not make any promise as to when that might happen or even whether I do it at all.

If anyone has suggestions in terms of font size, style or anything else to help me achieve this goal, I would appreciate your comments.

I will not apologize to the individual who left the incredibly rude remark on this site if he or she was offended, but I will thank him or her for drawing my attention to a potential problem.  After all, that is the point of this blog:  Finding solutions to problems.

A note on comments:  I appreciate all comments, and I don't mind some level of rudeness, but seriously there are limits... For those of you who are very passionate and quick to anger, tone it down just a bit.  I would hate to delete comments that may be valid, simply due to a seriously wrong remark.


Cheers!

Monday, November 2, 2015

Bulk update passwords

Shell script to update passwords for multiple users all at once.

Create a file with the list of account user names:

$ vim names.txt

user1
user2
user3

Create a script which will go through each username and execute the password update:

$ vim userupdate.sh

#!/bin/bash

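# read one username per line, skipping blank lines
while IFS= read -r i
do
  [ -n "$i" ] || continue
  echo "$i"
  # set the password to <username>123 for each user
  echo "${i}123" | passwd --stdin "$i"
  # force a password reset at first login
  chage -d 0 "$i"
done < names.txt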

Run the script:

$ chmod +x userupdate.sh

$ sudo ./userupdate.sh
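
The --stdin option of passwd is not available on every distribution.  As a hedged alternative, chpasswd reads user:password pairs from standard input, so the same list can be turned into its input format.  The sketch below keeps the <username>123 scheme from the script above; the function name make_chpasswd_input is hypothetical:

```shell
#!/bin/sh
# make_chpasswd_input FILE: print one "user:password" line per username in
# FILE, using the <username>123 scheme. Blank lines are skipped.
make_chpasswd_input() {
  while IFS= read -r user || [ -n "$user" ]; do
    [ -n "$user" ] || continue
    printf '%s:%s123\n' "$user" "$user"
  done < "$1"
}

# Preview only; nothing changes until the output is piped to chpasswd:
#   make_chpasswd_input names.txt | sudo chpasswd
```

The chage -d 0 step from the script above is still needed afterwards to force a reset at first login.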

This script was partially inspired by a HowtoForge article.

Wednesday, October 21, 2015

Checking file signature on Windows

The sysinternals suite provides the sigcheck.exe tool which is useful to verify the integrity of a file:

https://technet.microsoft.com/en-us/sysinternals/bb897441.aspx

Recently I had a strange issue on my Windows 10 tablet.   The Windows firewall asked me whether I trusted wuapihost.exe to communicate out on my private or public network.  This is an odd issue and is most likely a bug.  Information online is currently not very helpful.

The best thing I could do was simply verify the integrity and signature of the file by using sigcheck.exe.

I used it to check both the MD5 hash and the certificate signature.  The tool also offers an option to have the file uploaded to and checked by "www.virustotal.com", a subsidiary of Google.

The results of the tool:

c:\windows\system32\wuapihost.exe:
        Verified:       Signed
        Signing date:   2:11 AM 2015-07-10
        Publisher:      Microsoft Windows
        Description:    wuapihost
        Product:        Microsoft® Windows® Operating System
        Prod version:   10.0.10240.16384
        File version:   10.0.10240.16384 (th1.150709-1700)
        MachineType:    32-bit
        MD5:    7B8DF67BCA2EC042ED8B71F5226B51EE
        SHA1:   CEA9E6219086343472D050934CBAF21558DF67B5
        PESHA1: 2B5B80E0E70118E9AD667314CB7FBFD638A340AF
        PE256:  7E7B9738DE54A65D7DD09CB97F51394381BFA1334CE01A774BBC73528A765300
        SHA256: 001FF7CD1D524636F936814B9154C27971723C8B3F652CC3E03BD09BA4B21AA9
        IMP:    50A7A0582886E9AB08BEF947D1B09ADA

Wednesday, October 14, 2015

Authentication is required to create a color managed device kde vnc group

This is the message I get every time I need to resize the screen with VNC using KDE / Plasma 5.

Someone else has filed a bug report with Red Hat regarding this issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1149893

And Orion Poplawski posted a workaround which consists of:

"
You can place a .rules file in /etc/polkit-1/rules.d

I'm doing in 02-allow-colord.rules:

polkit.addRule(function(action, subject) {
   if ((action.id == "org.freedesktop.color-manager.create-device" ||
        action.id == "org.freedesktop.color-manager.create-profile" ||
        action.id == "org.freedesktop.color-manager.delete-device" ||
        action.id == "org.freedesktop.color-manager.delete-profile" ||
        action.id == "org.freedesktop.color-manager.modify-device" ||
        action.id == "org.freedesktop.color-manager.modify-profile") &&
       subject.isInGroup("nwra")) {
      return polkit.Result.YES;
   }
});
"

Tuesday, October 6, 2015

RedHat Software Collections - Directory Structure

In order to prevent non-standard Software Collection packages from interfering with standard ones, RedHat came up with a special directory structure to separate each package into its own little world.

Here is an example of what the directory tree looks like for MySQL 5.5 from SC2:

(note: "tree" won't provide a clear view of only the directories that I want to show, so I had to put the pieces together)

/opt
└── rh
    └── mysql55
        └── root
            ├── bin
            ├── boot
            ├── dev
            ├── etc
            ├── home
            ├── lib
            ├── lib64
            ├── media
            ├── mnt
            ├── opt
            ├── proc
            ├── root
            ├── sbin
            ├── selinux
            ├── srv
            ├── sys
            ├── tmp
            ├── usr
            └── var
                ├── cache
                ├── db
                ├── empty
                ├── games
                ├── lib
                │   ├── games
                │   ├── misc
                │   └── mysql
                ├── local
                ├── lock
                │   └── subsys
                ├── log
                ├── mail -> spool/mail
                ├── nis
                ├── opt
                ├── preserve
                ├── run
                │   └── mysqld
                ├── spool
                │   ├── lpd
                │   └── mail
                ├── tmp
                └── yp

The service names are also somewhat different, with a very precise convention, making it easy to distinguish between installed versions:

/etc/rc.d/init.d/mysql55-mysqld

Note how the first portion of the service name specifies the name and version of the package.

======

For details on RH and Community Software Collections, visit the documentation at:

https://www.softwarecollections.org/en/docs/

Tuesday, September 29, 2015

KDE Plasma 5 launchers not working in Gnome 3

Today I discovered that manually created KDE launchers (.desktop files in $HOME/.local/share/applications/) are not wholly compatible with Gnome 3.

I won't go into details, but instead will point you to this bug report: https://bugs.kde.org/show_bug.cgi?id=321152

I'm mainly interested in ensuring this works on my system as I regularly switch back and forth between KDE 5 and Gnome 3 (don't ask me why, we all have our oddities), and I need to ensure that launcher icons are available under both desktop environments.

When creating entries with KDE Menu Editor, you will find that these files look somewhat similar to the example below, which I created to launch JMETER:

[Desktop Entry]
Comment=
Exec=java -jar /home/<...>/<...>/apache-jmeter-2.10/bin/ApacheJMeter.jar
Icon=applications-utilities
Icon=office-chart-line-percentage
Name=JMETER
NoDisplay=false
Path[$e]=
StartupNotify=true
Terminal=0
TerminalOptions=
Type=Application
X-KDE-SubstituteUID=false
X-KDE-Username=


The Path[$e]= line is actually the cause of Gnome not displaying the launcher in its menu.

Simply adding a comment symbol (#) in front of it makes Gnome ignore that line, and the launcher will function.

 ...
#Path[$e]=
 ...

When you have several such entries, it is easier to use SED in conjunction with FIND to edit all of them at once.
find . -type f -name '*.desktop' -exec sed -i 's/^Path\[/#Path[/' {} \;
It's not the most elegant solution... but it works... sort of...