Impactcore: Linux notes for a Systems Administrator: 2012

Wednesday, December 12, 2012

Extending root partition on the fly - Part 2

In my previous post I discussed some techniques to expand disks while ensuring zero downtime.

These techniques are not always viable, as they depend on the Operating System version you have, as well as the LVM version. If you don't use LVM and instead use native Linux partitions, then things can get a bit uglier and you will probably need one or two reboots. In CentOS 5.2 I can't properly re-scan the ISCSI devices. It seems support for this may have only been added in version 5.4, when RedHat began adding shell scripts to perform the operation.

In any case, for the latest versions of RedHat and CentOS (5.5, 6, 6.2, 6.3) the previously described techniques work just fine. There is only one caveat. In order to avoid disabling the volume group, unmounting the filesystem and stopping the services that use it, one must create a new "Physical Volume" using pvcreate, as I've done. The only problem with this, and it's not a big one, is that you end up with separate physical volumes in the same volume group, all on the same disk.

If you want to expand the disk but have only one physical volume in the LVM, it will be necessary to disable the volume group in order to use pvresize. Note that this implies shutting down services and unmounting the filesystem to be expanded.

Example:

1) Extend disk in vmware

2) Rescan the disk
# echo "1" > /sys/class/scsi_device/<device number>/device/rescan

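3) Resize the partition in question using fdisk <device> (fdisk has no resize command: delete the partition and recreate it with the same starting sector and a larger size)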
# fdisk /dev/sdb

4) re-read the partition table:
# partprobe

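5) Stop the services that use the filesystem, then unmount it. If your service is apache and the filesystem is /var/www (for example):
# service httpd stop
# umount /var/www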

6) Disable the volume group
# vgchange -a n <volume group name>

7) Physical Volume Resize
# pvresize /dev/sdb1

7.1) If you check your physical volume now you should see the free extents:
# pvdisplay

8) Re-enable the volume group
# vgchange -a y <vg name>

9) Re-mount the device (for example /var/www, or use mount -a)
# mount /var/www

10) Restart your service
# service httpd start

11) Note that so far only the Volume Group has been expanded; neither the Logical Volume nor the File System has been extended yet. Both can be done on the fly, so it's o.k. if they are mounted. Extend the Logical Volume:
# lvextend -l +<num of free extents> /dev/<vg name>/<lv name>

12) Resize the filesystem
# resize2fs /dev/<vg name>/<lv name>

In our case we find that it's helpful to test these procedures on clones, to ensure we have the most appropriate technique for the situation.

Tuesday, October 23, 2012

Extending root partition on the fly - linux on vmware

You've extended your VM's only disk by a bunch of Gigabytes. You have Apache / MySQL running on it and you can't afford any downtime. You now need the Operating System to recognize all that new space. What can you do? Expanding a root partition at runtime on a guest Linux OS requires a bit of planning but is still fairly straightforward.

The following was performed on a CentOS 5.5 Guest VM running on VMWare.

Let's see how much space we have right now:

[root@... ~]# df -h
Filesystem            Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       18G   11G 6.5G 61% /
/dev/sda1              99M   13M   82M 14% /boot
tmpfs                 2.0G     0 2.0G   0% /dev/shm

First, make sure the OS can recognize that the hardware actually changed. Rescan your SCSI device:

# echo "1" > /sys/class/scsi_device/<device number>/device/rescan

Next comes the fun part. This is where planning for disaster comes in handy; so even though you don't want to take any downtime, plan for it: Take a snapshot of your VM.

Partition the new space on your device by adding a new partition with fdisk. In my case I have a /boot partition and a / partition, so my new partition will be number 3, or /dev/sda3.

# fdisk /dev/sda

-- if we now try to create a new physical volume, it should fail until we run the partprobe command.

# partprobe

UPDATE: Since RedHat 6 (CentOS 6), you can use the partx command to force the changes to take effect on the partition table. Note that partx does not do the same validation as partprobe, so if you've made mistakes with your partition layout, data can be erased. If you are certain that your layout is correct, then proceed using:

# partx -l /dev/sda

And to force changes to take effect:

# partx -v -a /dev/sda

( See RedHat's recommendation at: https://access.redhat.com/site/solutions/57542 )

-- Create a physical volume from the new partition now that the kernel knows about it.

# pvcreate /dev/sda3

-- Extend the volume group to use up all space on the new physical volume.

# vgextend VolGroup00 /dev/sda3

-- Extend the logical volume by the number of free physical extents available in the group (use +)

# lvextend -l +3200 /dev/VolGroup00/LogVol00

-- Finally run an online resize of the mounted filesystem without affecting anything.

# resize2fs /dev/VolGroup00/LogVol00

That's it. Now run df -h:

[root@... ~]# df -h
Filesystem            Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      115G   11G   99G 10% /
/dev/sda1              99M   13M   82M 14% /boot
tmpfs                 2.0G     0 2.0G   0% /dev/shm
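
The extent count passed to lvextend above (+3200) is the figure reported on the "Free PE" line of vgdisplay. Below is a minimal sketch of pulling that number out in a script. The helper name free_extents and the sample text are illustrative assumptions, not output captured from the system in this post.

```shell
#!/bin/bash
# Sketch: extract the free physical extent count from vgdisplay-style text.
# free_extents is a hypothetical helper; it reads the text on stdin.
free_extents() {
    # The line of interest looks like: "  Free  PE / Size   3200 / 100.00 GB"
    awk '/Free +PE/ { print $5 }'
}

# Illustrative sample only (not real output from the system in this post).
sample='  Total PE              3839
  Alloc PE / Size       639 / 19.97 GB
  Free  PE / Size       3200 / 100.00 GB'

printf '%s\n' "$sample" | free_extents
```

On a live system the input would come from "vgdisplay VolGroup00 | free_extents". LVM2 also accepts "lvextend -l +100%FREE", which avoids counting extents by hand.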

----------------- EXTENDING A SWAP PARTITION -----------------

If you are doing this on a SWAP partition, then make sure you follow these instructions:

After extending the VolumeGroup, disable your swap:

[root@... ~]# swapoff -a

Extend your swap logical volume ( by 320 Physical Extents in my case ):

[root@... ~]# lvextend -l +320 /dev/VolGroup00/LogVol01

Check your logical volume size:

[root@... ~]# lvdisplay /dev/VolGroup00/LogVol01
--- Logical volume ---
LV Name                /dev/VolGroup00/LogVol01
VG Name                VolGroup00
LV UUID                --- ---- ---- --- ----
LV Write Access        read/write
LV Status              available
# open                 0
LV Size                11.97 GB
Current LE             383
Segments               2
Allocation             inherit
Read ahead sectors     auto
- currently set to     256
Block device           253:1

Use mkswap to recreate the swap area on the enlarged logical volume. There is no need to worry about the data because when swap is disabled, it should not contain any data.

[root@... ~]# mkswap /dev/VolGroup00/LogVol01
Setting up swapspace version 1, size = 12851343 kB

Now restart your swap and check your memory:

[root@... ~]# swapon -a

[root@... ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         24106       2917      21188          0        371        484
-/+ buffers/cache:       2061      22045
Swap:        12255          0      12255

Activate memory in Linux at run time using bash - vmware

Following up on my previous post, "Adding scsi device at runtime on linux guest VM," here is how to use a bash "for-loop" to activate memory that was added at runtime to a VMWare guest.

I've tested this on a CentOS 5.5 system.

First, add the new memory using VMWare VSphere.

Second, find the new memory that is currently listed as "offline".

[root@... ~]# grep offline /sys/devices/system/memory/*/state
/sys/devices/system/memory/memory40/state:offline
/sys/devices/system/memory/memory41/state:offline
/sys/devices/system/memory/memory42/state:offline
/sys/devices/system/memory/memory43/state:offline
/sys/devices/system/memory/memory44/state:offline
/sys/devices/system/memory/memory45/state:offline
/sys/devices/system/memory/memory46/state:offline
/sys/devices/system/memory/memory47/state:offline
/sys/devices/system/memory/memory48/state:offline
/sys/devices/system/memory/memory49/state:offline
/sys/devices/system/memory/memory50/state:offline
/sys/devices/system/memory/memory51/state:offline
/sys/devices/system/memory/memory52/state:offline
/sys/devices/system/memory/memory53/state:offline
/sys/devices/system/memory/memory54/state:offline
/sys/devices/system/memory/memory55/state:offline
/sys/devices/system/memory/memory56/state:offline
/sys/devices/system/memory/memory57/state:offline
/sys/devices/system/memory/memory58/state:offline
/sys/devices/system/memory/memory59/state:offline
/sys/devices/system/memory/memory60/state:offline
/sys/devices/system/memory/memory61/state:offline
/sys/devices/system/memory/memory62/state:offline
/sys/devices/system/memory/memory63/state:offline
/sys/devices/system/memory/memory64/state:offline
/sys/devices/system/memory/memory65/state:offline
/sys/devices/system/memory/memory66/state:offline
/sys/devices/system/memory/memory67/state:offline
/sys/devices/system/memory/memory68/state:offline
/sys/devices/system/memory/memory69/state:offline
/sys/devices/system/memory/memory70/state:offline
/sys/devices/system/memory/memory71/state:offline

Then, use a for loop to activate that memory:

[root@... ~]# for memcount in {40..71}; do echo online > /sys/devices/system/memory/memory$memcount/state; done

Check the new memory is active:

[root@... ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          8045        854       7190          0          8         87
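
The hardcoded range {40..71} only fits this particular VM. As a hedged sketch, the loop can instead walk every memory block and online only those that report "offline". The function name online_offline_memory and its optional directory argument are assumptions added for illustration; the argument exists so the loop can be tried against a scratch directory first.

```shell
#!/bin/bash
# Sketch: bring every offline memory block online without hardcoding the range.
# online_offline_memory is a hypothetical helper; its optional argument is the
# sysfs memory directory, which allows a dry run against a scratch copy.
online_offline_memory() {
    local root="${1:-/sys/devices/system/memory}"
    local state
    for state in "$root"/memory*/state; do
        [ -e "$state" ] || continue            # glob matched nothing
        if [ "$(cat "$state")" = "offline" ]; then
            echo online > "$state"
            echo "onlined: $state"
        fi
    done
}
```

Run as root with no argument (online_offline_memory) to act on the real sysfs tree, then check the result with free -m as above.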

Friday, October 12, 2012

Pre-allocating RAM on a Virtualbox guest

One of the problems with guest VMs in Virtualbox is that RAM is dynamically allocated by the host as the guest uses an increasing quantity of memory. This is fine if you run many VMs which do not always need all the memory allocated to them at once; but you will find it inconvenient if you really need to allocate specific (read: large) amounts of memory to a VM. This is especially true when running a guest on a host like Windows 7, where SuperFetch has already allocated chunks of memory to different applications. When the guest requests more memory, the host OS does not give it, because that memory is already held for other programs which may not necessarily need it.

-- The solution then...

Forcing VirtualBox to "grab" all of the guest's memory at startup is possible. This will attempt to allocate the entire guest memory from the host. If that memory isn't really free, then the guest will not start at all.

There is a boolean flag which can be set as follows:

VBoxManage setextradata <VM NAME> VBoxInternal/RamPreAlloc 1

That's all.

Documentation: Unfortunately there is very poor documentation on the "VBoxInternal" keyset; probably because it is mainly used for development purposes and not necessarily "real" day-to-day use. Consider this then, a hack.

The possible variables that can be set using VBoxInternal seem to be defined in the following C++ header file:

http://www.virtualbox.org/svn/vbox/trunk/src/VBox/VMM/include/PGMInternal.h

Here is another interesting file worth reading:

http://www.virtualbox.org/svn/vbox/trunk/include/VBox/err.h

I would be very careful with attempting to set any of these without a good understanding of the VirtualBox codebase.

Wednesday, October 10, 2012

Adding or resizing scsi device at runtime on linux guest VM

It's always a tricky thing to add more disk space to a VM when you don't want to take down its services to reboot the box. Linux doesn't need to be rebooted just to know that a new device has been attached, but how do you get it to recognize it?

The following comes from a great blog post by Vivek Gite from NixCraft: http://www.cyberciti.biz/tips/vmware-add-a-new-hard-disk-without-rebooting-guest.html

The basic command to re-scan a SCSI host for new devices is:

echo "- - -" > /sys/class/scsi_host/<host#>/scan

Then check that the new disk shows up:

fdisk -l

tail -f /var/log/messages

If you do not have a new disk, but instead have increased the size of an existing disk, then you must rescan the device. Note that this may not be appropriate if the device is used for the /boot partition.

# echo "1" > /sys/class/scsi_device/<device>/device/rescan
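
When it is not obvious which host adapter the new disk is attached to, the scan can be sent to every host. The following is a minimal sketch under that assumption; rescan_scsi_hosts is a hypothetical helper name, and its optional argument only exists so the loop can be tested against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/bash
# Sketch: ask every SCSI host adapter to scan for newly attached devices.
# rescan_scsi_hosts is a hypothetical helper; the optional argument overrides
# the sysfs directory so the loop can be tried against a scratch copy.
rescan_scsi_hosts() {
    local root="${1:-/sys/class/scsi_host}"
    local host
    for host in "$root"/host*; do
        [ -e "$host/scan" ] || continue
        # "- - -" is a wildcard for channel, target and LUN.
        echo "- - -" > "$host/scan"
        echo "rescanned: ${host##*/}"
    done
}
```

Run it as root with no argument, then confirm the result with fdisk -l as described above.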

Friday, August 10, 2012

Parse Apache Logs by Date Range

Parsing apache logs by date and by date ranges can be fairly simple with a bit of awk scripting.

We use AWK to compare date fields in order to retrieve specific rows.

The date fields between access logs and error logs can vary, so some adjustments are needed:

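Note that the date field is contained within a single column in the access_log file, so we can compare against that single column, typically column #4. Because awk compares these values as plain strings, the range is only reliable when both ends fall within the same month.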

AWK Date Range for access logs:

$ awk '$4>"[09/Aug/2012:15:00:" && $4<"[09/Aug/2012:15:59:"' ./access_log | less

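The date field in the error log is spread over separate columns. Example: [Thu Aug 09 15:30:... That in itself is four columns. They must be combined in order to be compared effectively. To do this, we join those four columns into a single variable, ts, and compare ts against the start and the end of the range. (Passing the column expression as a command-line assignment such as from='$1 " " $2' does not work as intended: awk stores the text literally, and $from then evaluates to $0, the whole line.) Since the combined string starts with the day of the week, this only works for ranges within a single day. See below:

AWK Date Range for error logs:

$ awk '{ ts = $1 " " $2 " " $3 " " $4 } ts > "[Thu Aug 09 15:30:00" && ts < "[Thu Aug 09 15:59:00"' ./error_log | less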
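
Here is a minimal sketch of the error-log range filter as a reusable function. It joins the first four fields into one string before comparing, and takes the bounds as arguments. The name errorlog_range is a hypothetical helper, and the bounds must be written in the log's own format.

```shell
#!/bin/bash
# Sketch: print error_log lines whose timestamp falls between two bounds.
# errorlog_range is a hypothetical helper: errorlog_range START END FILE
# The bounds use the same layout as the log, e.g. "[Thu Aug 09 15:30:00".
errorlog_range() {
    awk -v start="$1" -v end="$2" '
        { ts = $1 " " $2 " " $3 " " $4 }   # rebuild "[Thu Aug 09 15:30:12"
        ts > start && ts < end              # plain string comparison
    ' "$3"
}
```

Usage: errorlog_range "[Thu Aug 09 15:30:00" "[Thu Aug 09 15:59:00" ./error_log | less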

Tuesday, June 26, 2012

yum - Error: database disk image is malformed

If you've ever gotten this cryptic error using yum, you'll find that it's very difficult to pinpoint the cause. The message itself, "database disk image is malformed," refers to a corrupted sqlite file. However, the RPM and YUM systems use a variety of different such files; therefore finding the right one can be difficult.

The best thing to do is to start by attempting to fix this using the available command line tool:

# yum clean all

This should solve the problem in most cases. If the problem continues, perhaps the RPM database files are corrupted. One of my previous articles talks about rebuilding these, but I will go over it again here:

The database files are located in "/var/lib/rpm" and are named __db.001 __db.002 etc... etc...

Delete those files:

# rm -f /var/lib/rpm/__db*

Rebuild the database:

# rpm --rebuilddb

Then try to clean the yum cache as per the above command and try your yum command again. If this continues to fail, try deleting your yum cache manually:

# rm -Rf /var/cache/yum

Now try the command again. This should have gotten rid of the last sqlite files yum could possibly use. The command should be able to rebuild all the databases correctly at this point.

Tuesday, April 17, 2012

Force Virtualbox Display in Fullscreen

I have a dual display setup with virtualbox. In fullscreen I found that virtualbox would switch onto my smaller display.

This can be adjusted.

1) Switch to fullscreen with Virtualbox

2) Use the combination of your "host" key + "home" key. (right-ctrl + home) in my case.

3) Go to the "view" menu item -> "virtual screen"

4) Select the appropriate monitor.

5) If the size of your display varies,  you will have to go out of fullscreen and back into fullscreen to re-adjust the workspace to the correct size.

Thursday, April 12, 2012

yum crashed with python import error - fixed corrupted rpm database

I ran into an interesting error while trying to find out which repository one of my installed packages came from. Before we proceed, let me explain that I needed the "repoquery" utility, which is part of the "yum-utils" package. I installed it, as I did not yet have it. The installation worked perfectly well and did not pull in any dependencies.

# yum install yum-utils -y

Using the repoquery command, I attempted to query which repo my php53 package came from:

$ repoquery -i php53

Instead of getting the information I wanted, the script crashed with the following error:

File "/usr/bin/repoquery", line 38, in <module>
from yum.i18n import to_unicode
cannot import name to_unicode

Googling around, many blog posts and sites talked about the yum installation being broken. That may well be, but I decided to see if maybe there was something a bit simpler at play here. First I needed to find out which yum packages were already installed on this system:

# rpm -qa | grep -i yum

yum-fastestmirror-1.1.16-16.el5.centos
yum-3.2.22-37.el5.centos
yum-updatesd-0.9-2.el5
yum-utils-1.1.16-16.el5.centos
yum-metadata-parser-1.1.2-2.el5

NOTE: The listed version numbers do not reflect the original numbers I had on my system. I went ahead and attempted to update all of the above listed packages:

# yum update yum yum-fastestmirror yum-updatesd yum-metadata-parser

Yum only found that yum and yum-fastestmirror needed to be updated. I proceeded with the update.

After the update, the repoquery command started working perfectly well. However, a completely unrelated problem occurred which I will discuss very briefly.

# repoquery -i php53

Instead of getting a nice listing of information from the RPM database, I received an error message saying the database was corrupted. The next step then was to rebuild the database.

The database files are located in "/var/lib/rpm" and are named __db.001 __db.002 etc... etc...

Delete those files:

# rm -f /var/lib/rpm/__db*

Rebuild the database:

# rpm -vv --rebuilddb

Once completed, I tried the repoquery command once again:

# repoquery -i php53

Name        : php53
Version     : 5.3.3
Release     : 1.el5_7.6
Architecture: x86_64
Size        : 3591477
Packager    : None
Group       : Development/Languages
URL         : http://www.php.net/
Repository  : updates
Summary     : PHP scripting language for creating dynamic web sites
Description :
PHP is an HTML-embedded scripting language. PHP attempts to make it
easy for developers to write dynamically generated webpages. PHP also
offers built-in database integration for several commercial and
non-commercial database management systems, so writing a
database-enabled webpage with PHP is fairly simple. The most common
use of PHP coding is probably as a replacement for CGI scripts.

The php package contains the module which adds support for the PHP
language to Apache HTTP Server.

Monday, April 2, 2012

Convert Blocks to Bytes

Very easily 

=========== CONVERT FROM BYTES TO BLOCKS =========== 

For example:

100 Megabytes = 1024 x 1024 x 100

---------------------------------

1 Megabyte  = 1024 x 1024 x 1

1 Kilobyte  = 1024 x 1

1 Byte      = 1

A block is a set quantity of bytes.  For example, a mounted partition could have blocks of 4096 bytes or 4k.

Typically however, quotas will have a block size of 1024 bytes ( 1k ).

To calculate the quantity of blocks for quotas you should allocate blocks using the following formula:

Allocate 100 MB of space to a user:

1024 x 1024 x 100 = 104,857,600 (in bytes) = 100 MB

Divide the number of bytes by the block size:

Block size of 4096: 104,857,600 / 4096 =  25,600 blocks

Block size of 1024: 104,857,600 / 1024 = 102,400 blocks

25,600  blocks = 100 MB if your block size is 4096 (unlikely for quotas)

102,400 blocks = 100 MB if your block size is 1024

=========== CONVERT FROM BLOCKS TO BYTES ===========

Now determine how many Megabytes a quota of 262144 blocks equals:

1) 262144 multiplied by the size of the blocks:

262144 x 1024 = 268,435,456 (number of total bytes)

2) Since we are dealing with megabytes, divide by 1024 x 1024

( 268,435,456 / 1048576 ) = 256
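
Both conversions above can be sketched with shell integer arithmetic. The function names mb_to_blocks and blocks_to_mb are hypothetical helpers introduced here for illustration; the numbers in the calls are the ones worked through above.

```shell
#!/bin/bash
# Sketch of the two conversions above using shell integer arithmetic.
# mb_to_blocks and blocks_to_mb are hypothetical helper names.

# mb_to_blocks MEGABYTES BLOCK_SIZE -> number of blocks
mb_to_blocks() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

# blocks_to_mb BLOCKS BLOCK_SIZE -> megabytes (integer division)
blocks_to_mb() {
    echo $(( $1 * $2 / 1024 / 1024 ))
}

mb_to_blocks 100 4096     # prints 25600
mb_to_blocks 100 1024     # prints 102400
blocks_to_mb 262144 1024  # prints 256
```

Note that blocks_to_mb rounds down, so a block count that is not a whole number of megabytes loses its remainder.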

Thursday, February 16, 2012

Add date to Bash History

In order to add a date stamp to your bash history add the following two lines to your .bash_profile:

HISTTIMEFORMAT='%F %T '

export HISTTIMEFORMAT

Alternatively, you can set this variable globally and have all history files keep the date by setting these two lines in a file under the /etc/profile.d directory.

# echo "HISTTIMEFORMAT='%F %T '" > /etc/profile.d/histtimestamps.sh

# echo "export HISTTIMEFORMAT" >> /etc/profile.d/histtimestamps.sh

# chmod +x /etc/profile.d/histtimestamps.sh

Your history will look like this:

...

902  2012-02-16 09:50:33 cd /var/log

903  2012-02-16 09:50:33 ll

904  2012-02-16 09:50:33 ls -lat | sort -t

905  2012-02-16 09:50:33 ls -lat

...

Monday, February 13, 2012

Build an SELinux policy from an audit log

Often certain commands in Linux will simply fail without any messages in /var/log/messages, or seemingly anywhere else we usually check. However, if you look at the SELinux audit log, /var/log/audit/audit.log, sometimes the error messages are there.

For example, every once in a while after a kernel update, I can't use the talk program. It simply says the connection is being refused by the other user. Since I already know SELinux is the culprit, I grep the logs:

grep -i talkd /var/log/audit/audit.log

The result:

type=AVC msg=audit(1329155365.865:143): avc:  denied  { open } for  pid=5631 comm="in.ntalkd" name="1" dev=devpts ino=4 scontext=system_u:system_r:ktalkd_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:user_devpts_t:s0 tclass=chr_file

type=SYSCALL msg=audit(1329155365.865:143): arch=c000003e syscall=2 success=no exit=-13 a0=7fffc83c0eb8 a1=101 a2=7fffc83c0ec3 a3=7fffc83c0690 items=0 ppid=5630 pid=5631 auid=4294967295 uid=99 gid=5 euid=99 suid=99 fsuid=99 egid=5 sgid=5 fsgid=5 tty=(none) ses=4294967295 comm="in.ntalkd" exe="/usr/sbin/in.ntalkd" subj=system_u:system_r:ktalkd_t:s0-s0:c0.c1023 key=(null)

Two entries showing that talk is denied. If you really want to authorize this process, grep the tail end of the file and use audit2allow to generate a policy file that will allow this.

tail /var/log/audit/audit.log | grep '1329155365.865:143' | audit2allow -M talkpolicy

audit2allow generates a talkpolicy.pp file and will also give you instructions on how to activate it. That would be:

semodule -i talkpolicy.pp

This will take a minute or two, after which the blocked program is effectively authorized to run.

Sunday, January 1, 2012

Redhat / Centos / Fedora VM Clone - Nic gets the wrong eth name.

After creating a clone and changing the MAC address I reboot the machine and find that the eth0 doesn't show up when I run the ifconfig command. This is to be expected of course as I have changed the MAC on the VM guest, but not on the OS's configuration. So:

# vim /etc/sysconfig/network-scripts/ifcfg-eth0

Modify the MAC Address here. Reboot the VM guest.

During the reboot eth0 does not get configured. Running ifconfig still returns nothing except the Loopback device. If I run:

# service network restart

Bringing up interface eth0: Device eth0 does not seem to be present, delaying initialization.

What is going on? I know the ethernet device is there, it's a VM. Let's see what dmesg says.

# dmesg | grep -i eth

e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
udev: renamed network interface eth0 to eth1

Huh? Why? Well, as it turns out, udev had also assigned the MAC address to its own configuration before the clone took place. So now that the clone is done and we have a new MAC, udev knows about it, but it thinks it's a new NIC entirely and doesn't have any configuration for it. Take a look at the udev file for the nic.

# vim /etc/udev/rules.d/70-persistent-net.rules

You will notice probably 3 entries. Normally there is only one entry in this file. The first time the system booted up, udev detected a conflict with the MAC so it created a new entry. It was named NAME="eth1" with the new guest VM's real MAC.

The first entry is the original entry. And the last entry exists because we changed the MAC address in the ifcfg-eth0 file. The only valid entry now is the last one, so all we have to do is delete entries 1 and 2, then reboot the system.

Try ifconfig and automagically eth0 is there with an IP address, provided you configured DHCP or a static IP, that is.
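
Before deleting anything, it can help to see at a glance which MAC address each rule binds to which interface name. Below is a minimal sketch. The helper name list_net_rules is hypothetical, and the rule layout it looks for (an ATTR{address}=="..." match followed by NAME="...") is an assumption about how the file is written on your release; older releases may use a different key.

```shell
#!/bin/bash
# Sketch: list the interface name and MAC address of each udev net rule,
# to help spot stale entries left behind by a clone.
# list_net_rules is a hypothetical helper; the rule layout it expects
# (ATTR{address}=="..." followed by NAME="...") is an assumption.
list_net_rules() {
    local rules="${1:-/etc/udev/rules.d/70-persistent-net.rules}"
    # Print "<name> <mac>" for every line that carries both keys.
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\2 \1/p' "$rules"
}
```

Compare the addresses it prints with the MAC shown in the VM settings; the entries that do not match the current MAC are the ones to remove.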