Claudio Borges

Technical blog about Linux, BSD, Mac OS X, Games and etc.

Archive for the ‘raid 1’ tag

Dynamic disk partitioning with RedHat Kickstart

Hi folks, this time I’ll show how to create a dynamic disk partitioning scheme using RedHat Kickstart.

TL;DR: I’ve added the entire Kickstart file to my GitHub account. Feel free to download it if you don’t want to read the entire article.

Overview

Many system administrators would prefer to use an automated installation method to install RedHat/CentOS Linux on their machines. RedHat created the Kickstart method to do that. Using it, you can create a single file containing the answers to all the questions that would normally be asked during a typical installation.

Unlike Debian preseed, a Kickstart file is concise and easy to understand. Both, however, are plain text files containing a list of items, each one identified by a keyword.

Your first Kickstart file

When you finish a RedHat/CentOS installation, you will find the file /root/anaconda-ks.cfg in root’s home directory. It was created from the options that you selected during the installation. Think of it as your first Kickstart file: you can use it to perform a new installation just by changing a few options.

There are two ways to create or modify a Kickstart file: use your preferred text editor (vi/vim, for example) or use a tool like Kickstart Configurator (system-config-kickstart).

While the general principles of Kickstart installations tend to stay the same, the commands and options can change between major releases of RedHat/CentOS. If you are creating a Kickstart file from scratch, or modifying an old one for a new installation, it is highly recommended to verify its syntax. You can use Pykickstart (available in the RedHat/CentOS repository), which provides the ksvalidator tool, to do that.

Pre-install section

The %pre section defines commands that will be performed before the installation starts (useful to automate things). You can use it to download a file or to generate a new one. For example, let’s use it to create the file /tmp/partitioning.txt with the partitioning scheme that will be used in the installation:

%pre
cat > /tmp/partitioning.txt <<EOF
clearpart --all --initlabel
bootloader --location=mbr --driveorder=sda

part /boot --fstype="xfs"   --size=512 --ondisk=sda
part pv.00 --fstype="lvmpv" --grow     --ondisk=sda

volgroup VolGroup00 pv.00

logvol swap --fstype="swap" --size=4096 --name=lv_swap --vgname=VolGroup00
logvol /    --fstype="xfs"  --size=3072 --name=lv_root --vgname=VolGroup00
logvol /tmp --fstype="xfs"  --size=512  --name=lv_tmp  --vgname=VolGroup00
logvol /var --fstype="xfs"  --size=1024 --name=lv_var  --vgname=VolGroup00
EOF
%end

PS: Although you can access the network in the %pre section, the name service has not been configured at this point; in other words, only IP addresses will work.

The --interpreter= option allows you to specify a different scripting language, such as Python. Any scripting language available on the system can be used; in most cases, these are /bin/sh, /bin/bash and /usr/bin/python. Add the option to the %pre line at the beginning of the script. The following example runs the pre-installation script with Python:

%pre --interpreter=/usr/bin/python
--- Python script omitted --
%end

Include section

The %include directive inserts the contents of another file into the Kickstart file, as though they were part of it. It is generally used to pull in a file generated or downloaded by a pre-install script.

To include a file in the Kickstart file, use %include as follows:

%include /path/to/file

Partitioning rules

Like I said, the %pre section will run immediately before the partitioner starts. It will count the number of disks, define the RAID level and the partitioning scheme. It will use the following rule to do that:

1 disk = LVM Physical device
2 disks = RAID 1 + LVM Physical device
3 disks = RAID 5 + LVM Physical device
4 disks or more = RAID 10 + LVM Physical device
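
The rule above can be sketched as a small shell function. This is only an illustration: the function name raid_level is an assumption of mine and is not part of the Kickstart file, and -1 stands for "no RAID, LVM only", the same convention the %pre script further down uses.

```shell
#!/bin/bash
# Hypothetical helper: map the number of disks to the RAID level used in
# this article. -1 means "no RAID, LVM only".
raid_level() {
    local count=$1
    if [ "${count}" -eq 1 ]; then
        echo -1
    elif [ "${count}" -eq 2 ]; then
        echo 1
    elif [ "${count}" -eq 3 ]; then
        echo 5
    elif [ "${count}" -ge 4 ]; then
        echo 10
    else
        # Zero disks: nothing to partition
        echo "no disks found" >&2
        return 1
    fi
}

# Show the level chosen for a few disk counts
for n in 1 2 3 4 6; do
    echo "${n} disk(s): level $(raid_level "${n}")"
done
```

Handling the "no disks" case explicitly is my own addition; it avoids going on with an undefined RAID level.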

Why Software RAID?

You may be wondering why anyone would use software RAID instead of hardware RAID. The answer depends on your situation. Hardware RAID (integrated into the motherboard or on a separate controller) performs about the same as software RAID. Good controllers, however, offer features that software RAID lacks and that low-cost controllers usually omit, such as caching, hot-swapping and battery backup. Software RAID, on the other hand, is implemented by the operating system; it is cheap, easy to set up and fairly versatile. It has its limitations too: for example, hot-swapping only works if the underlying hardware supports it. In my case, I (sometimes) use software RAID because I have a few servers with low-cost controllers.

Neither hardware nor software RAID is perfect. You can still lose the array to a controller failure or operator error.

Swap size and Partitioning scheme

The swap size is calculated from the amount of RAM, using the following rule:

if RAM < 2 GB then SWAP = 2x physical RAM
if RAM is between 2 GB and 8 GB then SWAP = equal to the amount of RAM
if RAM > 8 GB then SWAP = 4 GB
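
The rule above can be written as a short shell function, reading the second case as "between 2 GB and 8 GB". This is a minimal sketch: the name swap_size_mb is hypothetical, and sizes are in MB, as in the Kickstart logvol command.

```shell
#!/bin/bash
# Hypothetical helper: print the swap size (in MB) for a given amount of
# RAM (in MB), following the three-step rule from this article.
swap_size_mb() {
    local mem=$1
    if [ "${mem}" -lt 2048 ]; then
        # Less than 2 GB of RAM: twice the RAM
        echo $((mem * 2))
    elif [ "${mem}" -le 8192 ]; then
        # Between 2 GB and 8 GB: same as the RAM
        echo "${mem}"
    else
        # More than 8 GB: fixed 4 GB
        echo 4096
    fi
}

swap_size_mb 1024
swap_size_mb 4096
swap_size_mb 16384
```

Note that the second test only needs an upper bound, because the first branch has already handled everything below 2048. Combining "greater than 2048" and "less than or equal to 8192" with a logical OR would match every value and make the last branch unreachable.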

And the partitioning scheme will be as follows:

/dev/VolGroup00/lv_root - 3 GB   - /    xfs
/dev/VolGroup00/lv_tmp  - 512 MB - /tmp xfs
/dev/VolGroup00/lv_var  - 1 GB   - /var xfs

Kickstart file

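The following %pre section is the one I use in my production environment. It writes /tmp/partitioning.txt, which the rest of the Kickstart file pulls in with %include /tmp/partitioning.txt. Feel free to use it or change it if you want: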

%pre --interpreter=/bin/bash
# Use RAID+LVM or just LVM to partition the disks
#
# Partitioning scheme:
#
# /dev/md0 - 512 MB - /boot xfs (RAID device if you have 2 or more disks)
# /dev/md1 - RAID device + LVM VolGroup00 (if you have 2 or more disks)
# /dev/VolGroup00/lv_swap - the swap size is calculated from the amount
# of RAM in the system:
# if RAM < 2 GB then SWAP = 2x physical RAM
# if RAM is between 2 GB and 8 GB then SWAP = equal to the amount of RAM
# if RAM > 8 GB then SWAP = 4 GB
# /dev/VolGroup00/lv_root - 3 GB   - /    xfs
# /dev/VolGroup00/lv_tmp  - 512 MB - /tmp xfs
# /dev/VolGroup00/lv_var  - 1 GB   - /var xfs

# Get the whole disks (sdX, hdX, cciss/cXdY), skipping their partitions
DISKS=()
for DISK in $(awk '$NF ~ "^((s|h)d|cciss)" && $NF !~ "((s|h)d[a-z]+|c[0-9]+d[0-9]+p)[0-9]+$" {print $NF}' /proc/partitions); do
    DISKS+=("${DISK}")
done
COUNT=${#DISKS[@]}

# Define the RAID level (-1 means no RAID, just LVM)
if [ "${COUNT}" -eq 1 ]; then
    LEVEL=-1
elif [ "${COUNT}" -eq 2 ]; then
    LEVEL=1
elif [ "${COUNT}" -eq 3 ]; then
    LEVEL=5
elif [ "${COUNT}" -ge 4 ]; then
    LEVEL=10
fi

# Calculate the SWAP size (in MB) from the amount of RAM
MEM=$(($(sed -n 's/^MemTotal: \+\([0-9]*\) kB/\1/p' /proc/meminfo) / 1024))
if [ "${MEM}" -lt 2048 ]; then
    SWAP=$((MEM * 2))
elif [ "${MEM}" -le 8192 ]; then
    SWAP=${MEM}
else
    SWAP=4096
fi

# Create the RAID + LVM (if the system has two disks or more)
if [ "${LEVEL}" -ge 1 ]; then
    # Comma-separated list of disks, e.g. sda,sdb
    DISKLIST=$(IFS=,; echo "${DISKS[*]}")
    echo "ignoredisk --only-use=${DISKLIST}"                 > /tmp/partitioning.txt
    echo "clearpart --all --initlabel --drives=${DISKLIST}" >> /tmp/partitioning.txt
    for ((i = 0; i < COUNT; i++)); do
        # Each disk gets two RAID members: one for /boot, one for the LVM PV
        x=$((COUNT + i))
        echo "part raid.0${i} --fstype=\"mdmember\" --size=512 --ondisk=${DISKS[$i]}" >> /tmp/partitioning.txt
        echo "part raid.0${x} --fstype=\"mdmember\" --grow     --ondisk=${DISKS[$i]}" >> /tmp/partitioning.txt
        RAIDPARTS1[$i]="raid.0${i}"
        RAIDPARTS2[$i]="raid.0${x}"
    done
    echo "raid /boot --device=0 --fstype=\"xfs\"   --level=raid${LEVEL} ${RAIDPARTS1[*]}" >> /tmp/partitioning.txt
    echo "raid pv.00 --device=1 --fstype=\"lvmpv\" --level=raid${LEVEL} ${RAIDPARTS2[*]}" >> /tmp/partitioning.txt
# Otherwise, it will use just LVM on the single disk
else
    # clearpart mirrors the single-disk example earlier in this article
    echo "clearpart --all --initlabel"                                    > /tmp/partitioning.txt
    echo "part /boot --fstype=\"xfs\"   --size=512 --ondisk=${DISKS[0]}" >> /tmp/partitioning.txt
    echo "part pv.00 --fstype=\"lvmpv\" --grow     --ondisk=${DISKS[0]}" >> /tmp/partitioning.txt
fi

# Define the volume group and logical volumes
cat >> /tmp/partitioning.txt <<EOF
volgroup VolGroup00 pv.00
logvol swap --fstype="swap" --size=${SWAP} --name=lv_swap --vgname=VolGroup00
logvol /    --fstype="xfs"  --size=3072 --name=lv_root --vgname=VolGroup00
logvol /tmp --fstype="xfs"  --size=512 --name=lv_tmp --vgname=VolGroup00
logvol /var --fstype="xfs"  --size=1024 --name=lv_var --vgname=VolGroup00
EOF
%end
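
To see what the disk detection in the script above selects, a filter of the same kind can be tried against a sample of /proc/partitions outside the installer. This is only a sketch: the function name list_disks and the sample listing are made up for illustration, and the pattern is a slightly stricter variant that also skips partitions with two-digit numbers (sda10, for example).

```shell
#!/bin/bash
# Hypothetical helper: print whole-disk names found in a /proc/partitions
# style listing read from stdin, skipping partitions such as sda1 or c0d0p1.
list_disks() {
    awk '$NF ~ "^((s|h)d|cciss)" && $NF !~ "((s|h)d[a-z]+|c[0-9]+d[0-9]+p)[0-9]+$" {print $NF}'
}

# Made-up sample of /proc/partitions for a machine with two SCSI disks
# and a CD-ROM drive.
sample='major minor  #blocks  name

   8        0  488386584 sda
   8        1     524288 sda1
   8        2  487861248 sda2
   8       16  488386584 sdb
  11        0    1048575 sr0'

# Prints one disk name per line
echo "${sample}" | list_disks
```

Only sda and sdb pass the filter here; the header line, the partitions and the CD-ROM drive are dropped. Devices with other naming schemes (virtio or NVMe disks, for instance) are not matched by this pattern and would need their own alternatives.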

Conclusion

As you can see, Kickstart files are pretty easy and powerful. I tried to be generic and cover all kinds of disk devices in my script. For example, in the past, I had around 10 Kickstart files to handle different hardware brands and now I have just one.

So, that is all for now folks. If you have any questions about this article, feel free to ask me.

Written by but3k4

October 6th, 2015 at 8:46 am

Dynamic disk partitioning with Debian preseed

Hi folks, this article shows how to create a dynamic disk partitioning scheme using Debian preseed.

TL;DR: I’ve added the partitioning file (and other preseed files) to my GitHub account. Feel free to download it if you don’t want to read the entire article.

Overview

Debian preseed provides a way to set answers to questions asked during the installation process, without having to manually enter the answers while the installation is running. This makes it possible to fully automate most types of installations and it even offers some features not available during normal installations. Most of these questions are asked by Debian installer and can be preseeded by setting the answers in the debconf database.

The Debian installer has a directive called early_command. With it you can, for example, run a Bourne shell script (not Bash) or write a sh script inline. It’s powerful, but it has one problem: it doesn’t understand multiple lines, so you need to write all your rules in just one (logical) line.

PS: To keep the script readable, I’ve split it across multiple lines (adding a backslash “\” at the end of each line).

Partitioning rules

The early_command will run immediately before the partitioner starts. It will count the number of disks, define the RAID level and the partitioning scheme. To do that, it will use the following rule:

1 disk = LVM Physical device
2 disks = RAID 1 + LVM Physical device
3 disks = RAID 5 + LVM Physical device
4 disks or more = RAID 10 + LVM Physical device

Partitioning methods

PartMan uses 4 different methods to partition the disks:

  • regular – use the usual partition types for your architecture
  • lvm – use LVM to partition the disk
  • crypto – use LVM within an encrypted partition
  • raid – use RAID to partition the disk (with or without LVM)

To define the method, use the option “partman-auto/method”, for example:

partman-auto/method "raid"

Once the method is defined, PartMan needs to know which disks will be used in the partitioning scheme. If the system has only one disk, the installer will default to using it; otherwise, the device name must be given in traditional, non-devfs format (e.g. /dev/hda or /dev/sda, and not /dev/discs/disc0/disc). If the system has two disks (or more), you need to specify all of them:

partman-auto/disk "/dev/sda /dev/sdb /dev/sdc /dev/sdd"

In addition, you need to specify how the disks will be used in the RAID setup. Remember to use the correct partition numbers for logical partitions. RAID levels 0, 1, 5, 6 and 10 are supported, and the devices are separated by “#”. The parameters are:

<raidtype> <devcount> <sparecount> <fstype> <mountpoint> <devices> <sparedevices>

Using four disks (RAID 10 + LVM), for example, the values are defined as follows:

partman-auto-raid/recipe "10 4 0 lvm - /dev/sda1#/dev/sdb1#/dev/sdc1#/dev/sdd1 ."
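
The recipe value can also be assembled from a list of disks. The sketch below is an illustration only: the function name raid_recipe is hypothetical, it assumes the RAID member is the first partition of each disk and that partition names are the disk name followed by “1” (true for /dev/sdX, not for cciss devices), and it always sets the spare count to 0.

```shell
#!/bin/sh
# Hypothetical helper: build the value of partman-auto-raid/recipe for a
# RAID + LVM layout from a RAID level and a list of whole disks.
# Assumption: the first partition of every disk is the RAID member.
raid_recipe() {
    level=$1
    shift
    count=$#
    devs=""
    for disk in "$@"; do
        # Join the member partitions with "#"
        if [ -z "${devs}" ]; then
            devs="${disk}1"
        else
            devs="${devs}#${disk}1"
        fi
    done
    # <raidtype> <devcount> <sparecount> <fstype> <mountpoint> <devices> .
    echo "${level} ${count} 0 lvm - ${devs} ."
}

raid_recipe 10 /dev/sda /dev/sdb /dev/sdc /dev/sdd
```

Called with level 10 and the four disks used above, it prints the same value as the four-disk example.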

Why Software RAID?

You may be wondering why anyone would use software RAID instead of hardware RAID. The answer depends on your situation. Hardware RAID (integrated into the motherboard or on a separate controller) performs about the same as software RAID. Good controllers, however, offer features that software RAID lacks and that low-cost controllers usually omit, such as caching, hot-swapping and battery backup. Software RAID, on the other hand, is implemented by the operating system; it is cheap, easy to set up and fairly versatile. It has its limitations too: for example, hot-swapping only works if the underlying hardware supports it. In my case, I (sometimes) use software RAID because I have a few servers with low-cost controllers.

Neither hardware nor software RAID is perfect. You can still lose the array to a controller failure or operator error.

Partitioning Recipe

After choosing RAID or LVM (or RAID + LVM), you need to define the recipe. You can choose a predefined recipe, create your own recipe file and point to it (if you have a way to get it into the installer environment), or define your own recipe in the preseed file. I use only the last option; I won’t describe the others, but PartMan’s documentation explains all of them.

The recipe is defined by the option “partman-auto/expert_recipe”. When using RAID, you need to specify which devices will be part of the RAID and/or which partitions will be logical volumes. Like I said, the early_command accepts just one (logical) line, so I’ve split it across multiple lines with backslashes. Take a look at the code below:

partman-auto/expert_recipe string "multiraid :: \
    0 0 0 raid \
        $lvmignore{ } \
        $primary { } \
        method{ raid } . \

The “multiraid” in the first line is the name of the recipe (it is used by another PartMan option).

Note that the second line has the values “0 0 0 raid”. The three numbers are, respectively, the minimum size, the priority and the maximum size (more about them below), and the last value is the filesystem type. The other options are:

  • $lvmignore{ } – the RAID partitions are tagged as “lvmignore”, while the LVM logical volumes are tagged as “defaultignore” and “lvmok”.
  • $primary{ } – defines the partition as primary.
  • method{ } – defines how the partition is used. You can use swap, raid or format.

Now, take a look at the lines below:

        512 40960 15360 ext4 \
            $defaultignore{ } \
            $lvmok{ } \
            lv_name{ lv_root } \
            method{ format } \
            format{ } \
            use_filesystem{ } \
            filesystem{ ext4 } \
            mountpoint{ / } . \
    

PS: The “.” in the last line ends the definition of a partition (or logical volume); each entry in the recipe is closed by one.

Like I said, in Debian preseed files each partition has a minimum size, a priority and a maximum size. As you can see, the first line has the values “512 40960 15360 ext4”. PartMan uses these values to work out the partition sizes. They mean:

  • 512 – the minimum size of the partition in MB.
  • 40960 – the priority, used when this and the other listed partitions are competing for space on the disk (it is compared with the priorities of the other partitions).
  • 15360 – the maximum size of the partition in MB.
  • ext4 – the filesystem type.

The other options used in the previous example are:

  • $defaultignore{ } – makes the definition be ignored in the default (non-LVM) case; in other words, it is only valid in the LVM case.
  • $lvmok{ } – defines an LVM logical volume.
  • lv_name{ lv_root } – sets the LVM logical volume name.
  • method{ format } – formats the partition (use “keep” to not format it, or “swap” for swap partitions).
  • format{ } – also needed so that the partition is formatted.
  • use_filesystem{ } – this partition will have a filesystem on it (it won’t be swap, LVM, etc.).
  • filesystem{ ext4 } – which filesystem it gets.
  • mountpoint{ / } – where it will be mounted.

Optionally, you can set the volume group name. If you don’t specify it, PartMan will use a default name:

    partman-auto-lvm/new_vg_name string VolGroup00
    

Swap size and Partitioning scheme

The swap size is calculated from the amount of RAM, using the following rule:

if RAM < 2 GB then SWAP = 2x physical RAM
if RAM is between 2 GB and 8 GB then SWAP = equal to the amount of RAM
if RAM > 8 GB then SWAP = 4 GB

And the partitioning scheme will be as follows:

/dev/VolGroup00/lv_root   - 512 MB to 15 GB - /    ext4
/dev/VolGroup00/lv_tmp    - 256 MB to 4 GB  - /tmp ext4
/dev/VolGroup00/lv_var    - 512 MB to 15 GB - /var ext4
/dev/VolGroup00/lv_delete - all remaining space
    

You are probably wondering why I created a logical volume called lv_delete. The Debian installer has what looks like a bug: for some reason, it allocates all the remaining space to the last logical volume (only when you’re using preseed files). To work around that, I’ve defined lv_delete as the last logical volume; before the install finishes it is deleted, which gives this space back to the volume group.

Preseed file

The following preseed file is used in my production environment. Feel free to use it or change it if you want:

    ### Partitioning
    #
    # Use RAID+LVM or just LVM to partition the disk
    
    # Partitioning scheme:
    #
    # /dev/md0 - RAID device / LVM Physical volume (if you have 2 or more disks)
    # /dev/VolGroup00/lv_swap - the swap size is calculated from the amount
    # of RAM in the system:
    # if RAM < 2 GB then SWAP = 2x physical RAM
    # if RAM is between 2 GB and 8 GB then SWAP = equal to the amount of RAM
    # if RAM > 8 GB then SWAP = 4 GB
    # /dev/VolGroup00/lv_root   - 512 MB to 15 GB - /    ext4
    # /dev/VolGroup00/lv_tmp    - 256 MB to 4 GB  - /tmp ext4
    # /dev/VolGroup00/lv_var    - 512 MB to 15 GB - /var ext4
    # /dev/VolGroup00/lv_delete - all remaining space. This logical
    # volume was created because the Debian installer has a bug. It allocates
    # all remaining space to the last logical volume. The late_command will
    # execute a lvremove VolGroup00/lv_delete and give back this space to the
    # volume group.
    
    # The options used in the partitioning can be found in the following url:
    # https://wikitech.wikimedia.org/wiki/PartMan
    
    # These commands will run immediately before the partitioner starts.
    d-i partman/early_command string \
        COUNT=0; \
        SPARE=0; \
        for DISK in $(list-devices disk); do \
            DISKS="${DISKS} ${DISK}"; \
            if [ "$(echo ${DISK} | cut -d'/' -f3)" = "cciss" ]; then \
                DEVS="${DEVS}${DISK}p1#"; \
                SPARE_DEV="${DISK}p1"; \
            else \
                DEVS="${DEVS}${DISK}1#"; \
                SPARE_DEV="${DISK}1"; \
            fi; \
            COUNT=$((COUNT + 1)); \
        done; \
        DISKS=$(echo ${DISKS} | sed "s/^ //g"); \
        DEVS=$(echo ${DEVS} | sed "s/#$//g"); \
        if [ "${COUNT}" -eq "1" ]; then \
            RAID="-1"; \
        elif [ "${COUNT}" -eq "2" ]; then \
            RAID="1"; \
        elif [ "${COUNT}" -eq "3" ]; then \
            RAID="5"; \
        elif [ "${COUNT}" -ge "4" ]; then \
            RAID="10"; \
            SPARE=$((COUNT % 2));\
        fi; \
        if [ ${SPARE} -eq "1" ]; then \
            COUNT=$((COUNT - 1)); \
            DEVS=$(echo ${DEVS} | sed -e "s|${SPARE_DEV}||g;s/#$//g"); \
        else \
            SPARE_DEV=""; \
        fi; \
        MEM=$(($(sed -n 's/^MemTotal: \+\([0-9]*\) kB/\1/p' /proc/meminfo) / 1024)); \
        if [ "${MEM}" -lt "2048" ]; then \
            SWAP=$((MEM * 2)); \
        elif [ "${MEM}" -le "8192" ]; then \
            SWAP=${MEM}; \
        else \
            SWAP=4096; \
        fi; \
        debconf-set partman-auto/disk "$DISKS"; \
        if [ "${RAID}" -ge "1" ]; then \
            debconf-set partman-auto/method "raid"; \
            debconf-set partman-auto-raid/recipe "${RAID} ${COUNT} ${SPARE} lvm - ${DEVS} ${SPARE_DEV} ."; \
            debconf-set partman-auto/expert_recipe "multiraid :: \
                0 0 0 raid \
                    \$lvmignore{ } \
                    \$primary { } \
                    method{ raid } . \
                256 256 ${SWAP} linux-swap \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_swap } \
                    method{ swap } \
                    format{ } . \
                512 40960 15360 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_root } \
                    method{ format } \
                    format{ } \
                    use_filesystem{ } \
                    filesystem{ ext4 } \
                    mountpoint{ / } . \
                256 5120 4096 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_tmp } \
                    method{ format } \
                    format{ } \
                    use_filesystem{ } \
                    filesystem{ ext4 } \
                    mountpoint{ /tmp } . \
                512 40960 15360 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_var } \
                    method{ format } \
                    format{ } \
                    use_filesystem{ } \
                    filesystem{ ext4 } \
                    mountpoint{ /var } . \
                1024 1024 1024 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_delete } ."; \
        else \
            debconf-set partman-auto/method "lvm"; \
            debconf-set partman-auto/expert_recipe "root :: \
                256 256 ${SWAP} linux-swap \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_swap } \
                    method{ swap } \
                    format{ } . \
                512 40960 15360 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_root } \
                    method{ format } \
                    format{ } \
                    use_filesystem{ } \
                    filesystem{ ext4 } \
                    mountpoint{ / } . \
                256 5120 4096 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_tmp } \
                    method{ format } \
                    format{ } \
                    use_filesystem{ } \
                    filesystem{ ext4 } \
                    mountpoint{ /tmp } . \
                512 40960 15360 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_var } \
                    method{ format } \
                    format{ } \
                    use_filesystem{ } \
                    filesystem{ ext4 } \
                    mountpoint{ /var } . \
                1024 1024 1024 ext4 \
                    \$defaultignore{ } \
                    \$lvmok{ } \
                    lv_name{ lv_delete } ."; \
        fi
    
    # Install grub in the first device (assuming it is not a USB stick)
    d-i grub-installer/bootdev string default
    
    # Continue installation without /boot partition?
    d-i partman-auto-lvm/no_boot boolean true
    
    # Name of the volume group for the new system
    d-i partman-auto-lvm/new_vg_name string VolGroup00
    
    # Remove existing software RAID partitions?
    d-i partman-md/device_remove_md boolean true
    
    # Remove existing logical volume data?
    d-i partman-lvm/device_remove_lvm boolean true
    
    # Unable to automatically remove LVM data
    d-i partman-lvm/device_remove_lvm_span boolean true
    
    # Dummy template for preseeding unavailable questions
    d-i partman-auto/purge_lvm_from_device boolean true
    
    # Write the changes to the storage devices and configure RAID?
    d-i partman-md/confirm boolean true
    d-i partman-md/confirm_nooverwrite boolean true
    
    # Write the changes to disks and configure LVM?
    d-i partman-lvm/confirm boolean true
    d-i partman-lvm/confirm_nooverwrite boolean true
    
    # Write the changes to disks?
    d-i partman/confirm boolean true
    d-i partman/confirm_nooverwrite boolean true
    
    # Finish partitioning and write changes to disk
    d-i partman/choose_partition select finish
    
    # This command is run just before the install finishes, and,
    # it will remove the lv_delete
    d-i preseed/late_command string lvremove -f /dev/VolGroup00/lv_delete > /dev/null 2>&1
    

The last line in the preseed file is the late_command. It works similarly to the early_command; the only difference is that the late_command runs just before the install finishes. In my case it removes lv_delete, but you can use it for other tasks, such as installing packages or customizing the environment before the first boot.

Final Thoughts

Nothing is perfect, and Debian preseed is no exception. Its syntax isn’t easy (annoying is the right word), it has a weird bug (described above) and its documentation is horrible (Debian users, I’m so sorry about that, but Kickstart’s documentation is easier and more helpful). Besides, you need to be careful with it (especially with the early_command and the late_command) because, if you get even a small detail wrong, the Debian installer will only report a generic error.

So, that is all for now folks. If you have any questions about this article, feel free to ask me.

Written by but3k4

September 30th, 2015 at 11:28 am

GlusterFS – A file system for high availability

After a long time without publishing anything on the blog, I’m back with an interesting subject: GlusterFS.

In this article I’ll show how to install and configure GlusterFS to create a high-availability storage system using 2 servers. Both servers will act as client and server, and each one will be a mirror of the other: files are replicated automatically between them, in other words, a kind of RAID 1 over the network.

GlusterFS is a distributed file system capable of scaling to several petabytes. It works over InfiniBand RDMA or TCP/IP. The developers recommend the Ext3 and Ext4 file systems. Other file systems, such as ZFS, ReiserFS, btrfs and JFS, also work, but have not been widely tested. XFS has several performance problems due to its implementation of extended attributes; if you choose XFS, your performance with Gluster will be reduced by at least 60%.

You don’t need anything special to run it; you can use your existing hardware, such as servers with SATA/SATA-II or iSCSI/SAS disks.

The servers used in this article are:

Server 01: 192.168.0.10
Server 02: 192.168.0.11
Shared directory: /var/www

It is a good idea to add the following entries to /etc/hosts on each server:

192.168.0.10      servidor01
192.168.0.11      servidor02

Like previous articles, this one is based on Debian. The packages we will use are glusterfs-client and glusterfs-server, and the installation follows the usual procedure:

apt-get install glusterfs-client glusterfs-server

After installing the packages, go to the /etc/glusterfs directory, where you will see the following files:

glusterfs.vol
glusterfsd.vol

The first file holds the client configuration and the second one the server configuration. Since both servers will be client and server at the same time, these files must be identical on both machines.

Rename the files, adding .default to the end of each name:

cd /etc/glusterfs
mv glusterfs.vol glusterfs.vol.default
mv glusterfsd.vol glusterfsd.vol.default

Create the file /etc/glusterfs/glusterfs.vol with the following content:

# /etc/glusterfs client configuration file
#
volume client01
  type protocol/client
  option transport-type tcp/client
  option remote-host servidor01
  option remote-subvolume brick
end-volume

volume client02
  type protocol/client
  option transport-type tcp/client
  option remote-host servidor02
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate
  subvolumes client01 client02
end-volume

volume writeback
  type performance/write-behind
  option aggregate-size 1MB
  subvolumes replicate
end-volume

volume cache
  type performance/io-cache
  option page-size 512MB
  subvolumes writeback
end-volume

Create the file /etc/glusterfs/glusterfsd.vol with the following content:

# /etc/glusterfs server configuration file
#
volume posix
  type storage/posix
  option directory /var/www
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow 192.168.0.10,192.168.0.11
  subvolumes brick
end-volume

To better understand the options used in these configurations, I suggest reading the translators page.

With the files configured, start the daemon with the following command:

/etc/init.d/glusterfs-server start

Add the following entry to /etc/fstab on both servers:

/etc/glusterfs/glusterfs.vol    /var/glusterfs      glusterfs    defaults      0   0

Create the directory /var/glusterfs and mount it:

mkdir /var/glusterfs
mount -a

Now that everything is ready on both servers, let’s run the following tests:

- On server 01: save some files in /var/glusterfs.
- Connect to server 02 and check that the files are there.
- Reboot server 01.
- Check that everything is OK on server 02.
- Save some files on server 02.
- When server 01 comes back, check in /var/glusterfs that the files you saved while it was down have been replicated.
- Repeat the procedure, swapping the order of the servers.

You may be wondering why I am working in /var/glusterfs and not in /var/www. That is because, for replication to work, the data must be written to /var/glusterfs.

And that is all. If all the tests pass, you now have a RAID 1 over the network :).

Written by but3k4

April 18th, 2011 at 9:23 pm