LSI MegaRAID – Quick configuration without the web-based BIOS on SuperMicro IPMIs

Setting up a hardware RAID configuration remotely can be a rather tiresome task if you do not happen to do it on a daily basis. Using LSI’s web-based client via IPMI can be an extremely nerve-wracking pastime, and since we have run into this scenario a couple of times already, we thought it might be useful to post a very quick getting-started guide.

Getting into the configuration shell

  1. Connect to your server’s IPMI and launch the remote console.
  2. Boot the server and enter the preboot client shell (CLI) when the option is presented on screen.

How to create a virtual drive

In the shell, list all your physical disks this way:

-PDList -aALL

This will list all physical devices on all adapters. The output will scroll past quickly, so you may need to capture the screen with the IPMI console’s built-in video capture functionality (pausing the output often does not work). You will need the following values:

  1. adapter ID (usually 0 or 1)
  2. enclosure ID (often 252, but not always)
  3. slot ID (per disk, varies)

Next, create a virtual disk with your desired RAID setup. For example, a simple RAID 1 can be created like this (assuming the controller ID is 0, the enclosure ID is 252, and the two disks in question are in slots 2 and 3):

-CfgLdAdd -r1 [252:2,252:3] -a0
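If you want to double-check the result before rebooting, the logical drive details can be listed with the usual MegaCLI-style syntax (a quick sketch – this assumes the preboot shell accepts the same commands as the examples above, which it normally does):

-LDInfo -Lall -a0

This should show the new RAID 1 virtual drive together with its state and size.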

You are good to go – reboot, and your OS should pick up the new virtual disk.


Software RAID: re-adding disks after replacement

We recently dealt with this on a client’s server we manage, hosted in another ISP’s data centre. Below is a quick, typical way to get a broken (degraded) software RAID array back to a healthy, clean state:

First, check the raid status as such:

# cat /proc/mdstat

Personalities : [raid6] [raid5] [raid4] [raid1] [raid10] [raid0]
md0 : active raid1 sdb1[1] sda1[0]
4198976 blocks [3/2] [UU_]

md1 : active raid1 sdb2[1] sda2[0]
2104448 blocks [3/2] [UU_]

md2 : active raid5 sdc3[2] sdb3[1] sda3[0]
2917660800 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

This shows that md0 and md1 have issues.

Go into details for each problem device to get some additional information:

# /sbin/mdadm --detail /dev/md0

/dev/md0:
Version : 0.90
Creation Time : Tue Jun 8 07:47:09 2010
Raid Level : raid1
Array Size : 4198976 (4.00 GiB 4.30 GB)
Used Dev Size : 4198976 (4.00 GiB 4.30 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Wed Aug 21 06:24:07 2013
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : ...
Events : 0.3000

Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 0 0 2 removed

Devices already shown as removed no longer need to be removed with mdadm (otherwise this could be done with mdadm --manage /dev/mdX --fail /dev/sdYY followed by mdadm --manage /dev/mdX --remove /dev/sdYY). Repeat the check for /dev/md1 or any other failed/degraded RAID devices.

Replace the failed drive (hot swap, re-cable, etc. – you may have to reboot the machine for the OS to recognise the new disk, even in a hot swap scenario).

After replacing the bad disk, clone the partition table to the new disk:

# /sbin/sfdisk -d /dev/sda | sfdisk /dev/sdY
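Optionally, verify that the partition layouts now match before re-adding the devices (a quick sanity check, not strictly part of the procedure):

# fdisk -l /dev/sda
# fdisk -l /dev/sdY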

Then, re-add the new devices to the array:

# /sbin/mdadm --manage /dev/mdX --add /dev/sdYY

Repeat for all failed devices – the software RAID system will automatically schedule the respective resyncs.
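You can follow the rebuild progress while the resync runs, e.g.:

# cat /proc/mdstat

or, to refresh the view automatically every few seconds:

# watch -n 5 cat /proc/mdstat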

Please also see: http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array

Hard Disks: Bad Block HowTo

Hardware fails, that is a fact. Nowadays, hard drives are rather reliable, but nevertheless every now and then we will see drives failing or at least having hiccups. Using smartctl/smartd to monitor disks is a good idea. Below we discuss how some lesser issues can be handled without actually having to reboot the system – it remains up to a sysadmin’s own discretion to judge the circumstances correctly and evaluate whether the disk errors encountered are a one-time incident or indicative of an entirely failing disk.

Let’s have a look at a typical smartctl -a DEVICE output:

# smartctl -a /dev/sda

...
ID# ATTRIBUTE_NAME          .... RAW_VALUE
197 Current_Pending_Sector  .... 2
...

OK, so we have an oops here. Time to find out what is going on:

# smartctl --test=short /dev/sda

This will take a very short time, a couple of minutes at most, e.g.:

Please wait 2 minutes for test to complete.
Test will complete after Sat Feb  2 16:25:10 2013

Now, with a current pending sector count > 0 we will most likely have an ouch after the test completes:

Num  ..    Status                  Remaining  ..  LBA_of_first_error
...
# 2  ..    Completed: read failure 90%        ..  1825221261
...

LBA counts sectors in units of 512 bytes and starts at 0, so we now need to find out where 1825221261 is actually located:

# fdisk -lu /dev/sda

will display some information about the device in question:

   Device Boot      Start         End      Blocks   Id  System
...
/dev/sda3        31641600  1953523711   960941056   83  Linux
...

Obviously, LBA 1825221261 is on /dev/sda3. Now we need to determine the file system block for the LBA in question, so we first have to get the block size:

# tune2fs -l /dev/sda3 | grep Block

Block count:              240235264
Block size:               4096
Blocks per group:         32768

OK, 4096 bytes. So, the actual block number will be:

(LBA - PARTITION_START_SECTOR) * (512 / BLOCKSIZE)

In our case, this is:

(1825221261 - 31641600) * (512 / 4096) = 224197457.625

We only need the integer part, the fraction just tells us that we are into the 6th sector out of eight that make up this file system block.
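If you prefer to let the shell do the integer maths, bash arithmetic gives the block number directly (using the values from this example):

# echo $(( (1825221261 - 31641600) * 512 / 4096 ))
224197457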

It is good practice to find out which inode/file has been affected by using debugfs (operations can take a while with this tool):

# debugfs

debugfs:  open /dev/sda3
debugfs:  icheck BLOCK (224197457 in our case)
Block   Inode number
224197457       56025154
debugfs:  ncheck 56025154
Inode   Pathname
56025154        /some/path/to/file

Now, if this file isn’t anything crucial, we can start correcting things:

# dd if=/dev/zero of=/dev/sda3 bs=4096 count=1 seek=BLOCK
  (224197457 here)
# sync

smartctl -a will now show an updated current pending sector count, and you can re-run a short smartctl test.

Source: http://www.vanderzee.org/bad_blocks_howto

 

Migrating Proxmox KVM to Solus / CentOS KVM

By default, Proxmox stores the disks of KVM-based VMs as single image files, typically in raw or qcow2 format. Solus, however, uses an LVM-based setup. So how do you move things over from Proxmox to Solus? Here goes:

  1. Shut down the respective Proxmox VM;
  2. As an additional precaution, make a copy of the Proxmox VM (cp will do);
  3. If the Proxmox VM is not in raw format, you need to convert it using qemu-img:
    qemu-img convert PROXMOX_VM_FILE -O raw OUTPUT_FILE
    Proxmox usually stores the image files under /var/lib/vz/images/ID
  4. Create an empty KVM VM on the Solus node with a disk size at least as large as the raw file of the Proxmox VM (and possibly adjust settings such as driver, PAE, etc.), and keep it shut down;
  5. In the config file (usually under /home/kvm/kvmID) of the newly created Solus VM, check the following line:
    <source file='/dev/VG_NAME/kvmID_img'/>
    and make a note;
  6. dd the Proxmox raw image over to the Solus node:
    dd if=PROXMOX_VM.raw | ssh [options] user@solus_node 'dd of=/dev/VG_NAME/kvmID_img'
  7.  Boot the new Solus KVM VM;
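If you are unsure of the source image’s format before converting it in step 3, qemu-img can report it – the path follows Proxmox’s usual layout, but the exact file name will differ per VM and is only an example here:

# qemu-img info /var/lib/vz/images/ID/vm-ID-disk-1.qcow2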

IOPS and RAID considerations

IOPS (input/output operations per second) are still – maybe even more so than ever – the most prominent and important metric to measure storage performance. With SSD technology finding its way into affordable, mainstream server solutions, providers are eager to outdo each other offering ever higher IOPS dedicated servers and virtual private servers.

While SSD based servers will perform vastly better than SATA or SAS based ones, especially for random I/O, the type of storage alone isn’t everything. Vendors will often quote performance figures using lab conditions only, i.e. the best possible environment for their own technology. In reality, however, we are facing different conditions – several clients competing for I/O, as well as a wide ranging mix of random reads and writes along with sequential I/O (imagine 20 VPS doing dd bs=1M count=128 if=/dev/zero of=test conv=fdatasync).

Since most providers won’t offer their servers without RAID storage, let’s have a look at how RAID setups impact IOPS. Read operations will usually not incur any penalty, since they can be served from any disk in the array (the total theoretical read IOPS therefore being the sum of the individual disks’ read IOPS), whereas the same is not true for write operations, as we can see from the following table:

RAID level    Backend write IOPS per incoming write request
0             1
1             2
5             4
6             6
10            2

We can see that RAID 0 offers the best write IOPS performance – a single incoming write request equates to a single backend write – but we also know that RAID 0 bears the risk of total array loss if a single disk fails. RAID 1 and 10, the latter being providers’ typical or most advertised choice, offer a decent tradeoff – 2 backend writes per incoming write. RAID 5 and RAID 6, which have to read and update parity blocks on every write, bear the largest penalty.

Thus, when calculating the effective IOPS, keep in mind the write penalty that individual RAID setups come with.

The effective IOPS performance of your array can be estimated using the following formula:

IOPS_eff = ( n * IOPS_disk ) / ( R% + W% * F_RAID )

with n being the number of disks in the array, R% and W% being the read and write percentages of the workload, and F_RAID being the RAID write factor tabled above.

We can also calculate the total backend IOPS required for a given effective IOPS workload and RAID setup:

IOPS_total = ( IOPS_eff * R% ) + ( IOPS_eff * W% * F_RAID )

So if we need 500 effective IOPS, and expect around 25% read, and 75% write operations in a RAID 10 setup, we’d need:

500 * 0.25 + 500 * 0.75 * 2 =  875 total IOPS

i.e. our array would have to support at least 875 total, theoretical IOPS. How many disks/drives does this equate to? Today’s solid state drives will easily handle that, but what about SATA or SAS based RAID arrays? A typical SAS 10k hard disk drive will give you around 100-140 IOPS, so we would need roughly eight such drives to achieve our desired IOPS performance.
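For those who prefer to plug in their own numbers, here is a small throwaway shell sketch of the two formulas above (the workload figures are simply the ones from this example and are easy to change):

#!/bin/sh
# estimate backend IOPS and drive count for a given workload and RAID level
EFF=500          # required effective IOPS
READ=0.25        # read share of the workload
WRITE=0.75       # write share of the workload
F_RAID=2         # RAID write factor (see table above; 2 = RAID 1/10)
PER_DISK=120     # rough IOPS of a single 10k SAS drive

awk -v eff="$EFF" -v r="$READ" -v w="$WRITE" -v f="$F_RAID" -v d="$PER_DISK" 'BEGIN {
    total = eff * r + eff * w * f
    printf "total backend IOPS needed: %.0f\n", total
    printf "drives needed (rounded up): %d\n", int((total + d - 1) / d)
}'

With the values above this prints 875 total IOPS and 8 drives, matching the worked example.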

Conclusion:
All RAID levels except RAID 0 have significant impact on your storage array’s IOPS performance. The decision about which RAID level to use is therefore not only a question about redundancy or data protection, but also about resulting performance for your application’s needs:

  1. Evaluate your application’s performance requirements;
  2. Evaluate your application’s redundancy needs;
  3. Decide which RAID setup to use;
  4. Calculate the resulting IOPS performance necessary;

 

Sources:

Calculate IOPS in a storage array by Scott Lowe, TechRepublic, 2/2010
Getting the hang of IOPS by Symantec, 6/2012


Adding disks to Windows VMs under KVM

Reading through various posts on forums and blogs all over the web, you will find many solutions offered for adding another disk to a Windows VM running under KVM. Below is one that worked smoothly on all our nodes running the Solus control panel with KVM as the virtualisation technology:

  1. create a new volume with
    lvcreate -L [INTEGERSIZE]G -n [NEW_VOL_NAME] [VOLUMEGROUPNAME]
  2. edit the VM’s config file (under Solus, this is usually /home/kvm/kvmID/kvmID.xml), and add a section below the first disk (assuming hda has already been assigned; we use hdb here for the new disk):
        <disk type='file' device='disk'>
         <source file='/dev/VOLUMEGROUPNAME/NEW_VOL_NAME'/>
         <target dev='hdb' bus='ide'/>
        </disk>
  3. shut down and then boot the vm
  4. log in, and in the storage section of your server administration tool, initialise and format the new disk

NB for Solus: you will have to create a hook and enable advanced config in the control panel, otherwise Solus will overwrite the edited config again. The most basic hook just keeps the production config in a separate file in the same directory and ensures that this file is the one actually used, e.g. via ./hooks/hook_config.sh (must be executable):

#!/bin/sh
# keep the Solus-generated config as .dist and put our edited config (with the new disk) in place
mv /home/kvm/kvmID/kvmID.xml /home/kvm/kvmID/kvmID.xml.dist
cp -f /home/kvm/kvmID/kvmID.xml.newdisk /home/kvm/kvmID/kvmID.xml
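Do not forget to make the hook executable, e.g. (assuming the hooks directory sits under the VM’s config directory, analogous to the Xen example further below):

# chmod +x /home/kvm/kvmID/hooks/hook_config.sh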

 

Xen HVM / Solus – Network card / driver issues

Every now and then we run into problems with fully virtualised VMs not recognising their assigned network card. Most often, this happens under Xen HVM with the latest Debian/Ubuntu – and even CentOS – full or netinstall ISOs.

Under Solus CP there is a very simple fix for this, since changing the network card via the panel’s custom config option does not seem to work properly. Pretty much every Linux distribution will recognise the emulated e1000 card that the hook below sets:

On the node with the affected VM, go to /home/xen/vmID and check the vif line in the vmID.cfg file. Take note of it, then go to the hooks directory (or create it if it does not exist yet).
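For example, to see the current line:

# grep vif /home/xen/vmID/vmID.cfg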

Create an executable file hook_config.sh, and edit it as follows:

#!/bin/sh
# strip the existing vif line and re-add it with the network card model set explicitly
grep -Ev 'vif' /home/xen/vmID/vmID.cfg > /home/xen/vmID/vmID.cfg.tmp
mv /home/xen/vmID/vmID.cfg.tmp /home/xen/vmID/vmID.cfg
echo "vif        = ['ip=aaa.bbb.ccc.ddd, vifname=vifvmID.0, mac=..., rate=...KB/s, model=e1000']" >> /home/xen/vmID/vmID.cfg

Save it, and reboot your VM. This should let your VM find its network card and allow you to continue with the installation and subsequent production use.

 

Updating CentOS (RHEL, Fedora)

This is just a very concise summary to guide you through the typical update process of a CentOS-based Linux server that has no control panel installed on top of it. This post will also appear in our virtual server hosting blog:

  1. run yum check-update from the shell.
    This will give you a list of newly available packages for your distribution based on the repositories you have defined. This list will typically not be too long for a well maintained server, unless the distribution itself has just undergone a major update (such as from CentOS 5.7 to 5.8 recently).
  2. check the packages listed and ensure that your currently running applications will still be compatible with the new versions of any packages updated.
  3. make backups of any individual settings you have made for any packages that are going to be updated (httpd.conf, php.ini, etc.). Usually, these will not be touched, but it doesn’t hurt to make sure you have a copy (in addition to the regular backups you should be doing!).
  4. once you have confirmed that everything should still be fine after the update, run yum update from the shell.
    This will start the update process, and you will have to confirm the transaction before it is actually carried out (last chance to say “no”!).
  5. once complete, restart affected services (such as httpd, for example), or reboot your server if vital system packages have been updated (kernel, libc, …)
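For reference, the bare command sequence looks something like this (the config path is only an example – back up whatever you have customised, and substitute the services you actually run):

# yum check-update
# mkdir -p /root/cfg-backup && cp -a /etc/httpd/conf/httpd.conf /root/cfg-backup/
# yum update
# service httpd restart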

 

My server crashed…what now?

So now it has finally happened…your server has crashed beyond repair. It won’t boot, or what it boots bears little resemblance to what you expect it to come up with, the remote console shows a manual file system check is needed, grub cannot find a kernel, your root partition is gone, Windows says it cannot find any disks anymore, and all the other nightmares you thought could only happen to everyone else but you.

Now what?

First of all: DON’T PANIC!

For those of you who are familiar with Adams’ Hitchhiker’s Guide to the Galaxy, this advice sounds more than familiar, and it is in fact the very first action to take. Panic will cloud your mind, and everything will take much longer than if you proceed calmly and take the time to think twice before you do anything at all.

  1. Assess the situation: Are you able to try fixing it yourself, or are you not familiar enough with the error displayed, or the symptoms coming up?
  2. Do not mess with the system too much: Even if you do not have a managed server and therefore have to take a look yourself first, you are not on site and do not have the means to carry out a hardware repair. On top of that, the longer the remote actions take and the more scattered the salvage attempts become, the greater the risk that the damage grows.
  3. Ask your provider to step in: If you have a managed server, they will have to handle it anyway and, depending on the SLA in place, will provide you with a replacement machine in the meantime, a failover solution, etc. If you do not have a managed server, your provider is still your best bet for on-site operations, as they are the ones with physical access to the server; even if they are not proficient themselves, they should be able to bring in someone who can look at the machine faster than you could. If you have hired your own sysadmin (who is not on site either, however), your ISP and your sysadmin can communicate to discuss the best course of action.
  4. In the meantime, have your provider – with or without a respective SLA – set up a new server, a replacement VPS, a shared hosting account, in other words anything that allows you to bring back your site saying “We are performing maintenance / crash recovery / you name it”, i.e. something that brings you back in touch with your customers so they are aware you are on top of the situation. Use Twitter, Facebook, your customer portal (if you have one on another machine), etc. to let your clients know which of them are affected, and why.
  5. Depending on the interim solution, get ready to bring your backups back online (you do have them, don’t you?). In a managed environment, your provider will most likely have them, otherwise (and in fact no matter what) you should have an external backup somewhere as well.
  6. Once you have a new production server (or the old one repaired), and have set it up with its operating system, updated it to the latest patchset and security fixes, and brought it to a state that matches the environment before the crash, use your backups and perform data recovery. Do not go live again yet, however:
  7. Test, test, and test again if everything is working according to specs and expectations. Naturally, you will want to be back online as fast as possible. On the other hand, you want to avoid nasty surprises such as inconsistent databases, mismatched orders, invoices, etc. It is up to your judgement to find the right balance here.
  8. Once everything is back to normal, write up an Incident Report and send it out to all customers who were affected by the outage, and handle compensation as per your own SLA and TOS.

 

Managed or not?

We had a similar post back in July 2011 (cf. here), so why are we bringing this up again? Recently, we have seen a large surge in two categories of orders: unmanaged low-end VPS (256 MB memory and the like, for use as DNS servers, etc.), and fully managed servers.

Customers are increasingly aware of the need to back their sites with a well-managed server. Typically, the managed option only extends to managing the operating system (and possibly the hardware) of the server in question, i.e. keeping the operating system updated with the latest security patches (something an “intelligent” control panel such as cPanel can mostly handle itself), installing the latest package upgrades, and generally making sure the server works as intended.

In most cases, however, managed does not cover application issues. This is a crucial point: you as the customer need to be sure that the server administration side of your enterprise speaks the same language as the application development side. Nothing is worse than an eager sysadmin updating a software package without consulting the developers who, incidentally, depend on the older version for the entire site to run smoothly. With today’s globalised workforce this can cause you additional grief – often your developers are from a different company than your ISP, and they will naturally defend themselves against taking the blame. It will leave you and your enterprise crippled or hindered.

What do we advise?

  1. Don’t save money on a sysadmin.
  2. Make sure your sysadmin talks to your developers and understands what they need.
  3. Make sure your sysadmin has a basic understanding of your application in case of emergencies.
  4. Make sure your staff – sysadmins and developers – coordinate updates and upgrades.
  5. Make sure you have a working test environment where you can run the updates and upgrades in a sandbox to see whether things still work as expected afterwards.
  6. Have a team leader coordinate your sysadmin(s) and developer(s), or take this role upon yourself.

How much is it going to cost you?

Fully managed packages vary in cost – the normal sysadmin packages that deal with the operating system only will add anything between £20 and £200 per month to your budget. If you want the sysadmin to be an integral part of your team and support your application as well (in terms of coordinated server management), the price will be towards the higher end of that range, but may then already include some support for the application itself.

Who to hire?

Get someone with experience. There are sysadmins out there who have decades of experience and know the dos and don’ts, and there are sysadmins who consider themselves divine just because they have been “into Linux for 2 years”. A sysadmin is not someone who jumps at the first sight of an available package upgrade and yum-installs 200 dependencies just to claim the system is up to date. A sysadmin is someone who understands the implications of a) upgrading and b) not upgrading, who weighs these pros and cons and explains them to you before making suggestions as to what to do. A sysadmin is someone you trust to take this decision off your shoulders so you can run your business instead of having to worry whether the next admin cowboy is going to blow up your server. A sysadmin is someone who knows not only how to keep a system alive, but also how to bring a failed system back to life.

These are just some general guidelines, contact us for further advice, we are happy to help!