Pieter de Rijk

Customizing CoreOS images

Posted on: May 1, 2014 | By: Pieter de Rijk

I have been impressed by the Docker and CoreOS projects for quite a while, and looking into them has been on my to-do list for just as long…

Since I have access to a playground with some old workstations, I decided to start playing around with CoreOS using PXE boot (which was already set up in that environment).

So I followed the instructions on the CoreOS PXE Boot page, but the boot kept complaining about an “invalid or corrupt kernel image”, although the checksums (MD5/SHA1) were OK. The TFTP server runs on RHEL5 and had an old version of pxelinux; after downloading the latest syslinux binary from kernel.org, the system booted.

Once booted, the system ‘alerted’ me to the fact that the test environment uses an MTU of 9000 (jumbo frames)…

As far as I can tell, this cannot be fixed with the cloud-config over HTTP method, because the cloud-config is loaded by the OS and therefore requires a network interface that is already up and running (with the correct MTU set).

So I had to modify the CoreOS initrd to update the MTU in /usr/lib/systemd/network/99-default.link:

MTUBytes=9000

So we need to unpack the initial ramdisk.

Unpacking the CoreOS Ramdisk

Step 1) Create a temporary location in /tmp:

# mkdir -p /tmp/coreos/{squashfs,initrd,original,custom}

Step 2) Download or copy the ramdisk to /tmp/coreos/original:

# cd /tmp/coreos/original/
# wget http://storage.core-os.net/coreos/amd64-usr/alpha/coreos_production_pxe_image.cpio.gz

Step 3) Unzip the ramdisk and unpack the cpio archive:

# gunzip coreos_production_pxe_image.cpio.gz
# cd ../initrd
# cpio -id < ../original/coreos_production_pxe_image.cpio

Step 4) Unsquash the squash filesystem and move the original container out of the way:

# cd ../squashfs/
# unsquashfs ../initrd/usr.squashfs
# mv ../initrd/usr.squashfs ../usr.squashfs-original

Please note that you need at least version 4.0 of the squashfs tools. If your distribution ships an older version, you can download the source and compile the binaries (this works on RHEL5, at least).

You can now access the unpacked image via /tmp/coreos/squashfs/squashfs-root and make modifications. Because usr.squashfs holds the contents of /usr, drop the /usr prefix from a path and look it up relative to /tmp/coreos/squashfs/squashfs-root. For example:

/usr/lib/systemd/network/99-default.link can be found in:
/tmp/coreos/squashfs/squashfs-root/lib/systemd/network/99-default.link

So hack around and apply modifications where needed.
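For the MTU change itself, a minimal sketch could look like the following. It assumes the .link file already has a [Link] section (if it does not, nothing is added), and `set_mtu` is a hypothetical helper name, not something shipped with CoreOS or systemd.

```shell
# set_mtu FILE MTU: set MTUBytes= in the [Link] section of a systemd .link file.
# Hypothetical helper; assumes FILE already contains a [Link] section header.
set_mtu() {
    file=$1 mtu=$2
    tmp=$(mktemp) || return 1
    awk -v mtu="$mtu" '
        /^MTUBytes=/ { next }                   # drop any existing setting
        { print }                               # keep every other line as-is
        /^\[Link\]/ { print "MTUBytes=" mtu }   # add the new value right after [Link]
    ' "$file" > "$tmp" && cat "$tmp" > "$file"  # cat keeps the owner and mode of FILE
    status=$?
    rm -f "$tmp"
    return $status
}
```

In this walkthrough it would be called as `set_mtu /tmp/coreos/squashfs/squashfs-root/lib/systemd/network/99-default.link 9000`.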

Packing the CoreOS Customized Ramdisk

Now we have to repack the ramdisk, so we can load it…

Step 1) Repack the squashfs

# cd /tmp/coreos/squashfs
# mksquashfs squashfs-root/ ../initrd/usr.squashfs -noappend -always-use-fragments

Please ensure you use squashfs tools 4.0!

Step 2) Pack everything into a cpio archive and gzip it

# cd /tmp/coreos/initrd
# find . | cpio -o -H newc | gzip > ../custom/coreos_CUSTOM_pxe_image.cpio.gz

Now boot the system using the custom image as the initrd.
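How the initrd is referenced depends on your PXE setup. As a rough sketch, a pxelinux entry for the custom image could be generated like this; the label name and the default kernel file name are assumptions for illustration, so adjust them to whatever your TFTP root actually contains.

```shell
# pxe_entry INITRD [KERNEL]: print a pxelinux.cfg entry that boots the given initrd.
# The label "coreos-custom" and the default kernel file name are assumptions.
pxe_entry() {
    initrd=$1
    kernel=${2:-coreos_production_pxe.vmlinuz}
    printf 'label coreos-custom\n'
    printf '  kernel %s\n' "$kernel"
    printf '  append initrd=%s\n' "$initrd"
}
```

For example, `pxe_entry coreos_CUSTOM_pxe_image.cpio.gz` prints an entry you can paste into your pxelinux configuration.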

GlusterFS rebalance weight based (WIP)

Posted on: April 17, 2014 | By: Pieter de Rijk

Please note that I still have to perform further testing to validate that this works as expected, but I would like to at least share it…

On one of the GlusterFS instances I manage I have a wide variety of disk (brick) sizes.

5x 1TB
2x 2TB
1x 3TB

However, GlusterFS currently does not take different disk sizes into account when ‘rebalancing’ the data.

After some searching on the Internet I noticed that there is a proposal to build this into the GlusterFS code (check this proposal on the Gluster Community).

So far the steps are actually pretty ‘simple’…

Step 1)

Download the python scripts (as root)

# mkdir -p $HOME/glusterfs-weighted-rebalance
# cd $HOME/glusterfs-weighted-rebalance
# wget https://raw.githubusercontent.com/gluster/glusterfs/master/extras/rebalance.py \
       https://raw.githubusercontent.com/gluster/glusterfs/master/extras/volfilter.py

Step 2)

Run the python script

# python rebalance.py -l glusterfs-cluster Backup_Volume
Here are the xattr values for your size-weighted layout:
Backup_Volume-client-0: 0x00000002000000000000000015557a94
Backup_Volume-client-1: 0x000000020000000015557a952aaaf523
Backup_Volume-client-2: 0x00000002000000002aaaf52440006fb2
Backup_Volume-client-3: 0x000000020000000040006fb35555ea41
Backup_Volume-client-4: 0x00000002000000005555ea426aab64d0
Backup_Volume-client-5: 0x00000002000000006aab64d19555c505
Backup_Volume-client-6: 0x00000002000000009555c506d5559fca
Backup_Volume-client-7: 0x0000000200000000d5559fcbffffffff
The following subvolumes are still mounted:
Backup_Volume-client-0 on /tmp/tmp2oBBLB/brick0
Backup_Volume-client-1 on /tmp/tmp2oBBLB/brick1
Backup_Volume-client-2 on /tmp/tmp2oBBLB/brick2
Backup_Volume-client-3 on /tmp/tmp2oBBLB/brick3
Backup_Volume-client-4 on /tmp/tmp2oBBLB/brick4
Backup_Volume-client-5 on /tmp/tmp2oBBLB/brick5
Backup_Volume-client-6 on /tmp/tmp2oBBLB/brick6
Backup_Volume-client-7 on /tmp/tmp2oBBLB/brick7
Don’t forget to clean up when you’re done.
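To get a feel for what these values mean, here is a small sketch that decodes one of them. It assumes that the last 16 hex digits are the 32-bit start and end of the brick's hash range, which is what the contiguous values above suggest; `layout_share` is a hypothetical helper name.

```shell
# layout_share XATTR: print "start end percent" for one size-weighted layout value.
# Assumes the last 16 hex digits are the 32-bit start and end of the hash range.
layout_share() {
    hex=${1#0x}                                           # strip the leading 0x
    start=$(( 0x$(printf '%s' "$hex" | cut -c17-24) ))    # third 32-bit word
    end=$(( 0x$(printf '%s' "$hex" | cut -c25-32) ))      # fourth 32-bit word
    size=$(( end - start + 1 ))
    # Share of the full 32-bit hash space, as a truncated integer percentage.
    echo "$start $end $(( size * 100 / 4294967296 ))"
}
```

Read this way, the value for Backup_Volume-client-7 covers roughly one sixth of the hash space, which fits a 2TB brick in a 12TB pool.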

Step 3)

Set the xattr trusted.glusterfs.size-weighted per brick to the values mentioned above:

# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000000000000015557a94 /tmp/tmp2oBBLB/brick0
# setfattr -n trusted.glusterfs.size-weighted -v 0x000000020000000015557a952aaaf523 /tmp/tmp2oBBLB/brick1
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000002aaaf52440006fb2 /tmp/tmp2oBBLB/brick2
# setfattr -n trusted.glusterfs.size-weighted -v 0x000000020000000040006fb35555ea41 /tmp/tmp2oBBLB/brick3
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000005555ea426aab64d0 /tmp/tmp2oBBLB/brick4
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000006aab64d19555c505 /tmp/tmp2oBBLB/brick5
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000009555c506d5559fca /tmp/tmp2oBBLB/brick6
# setfattr -n trusted.glusterfs.size-weighted -v 0x0000000200000000d5559fcbffffffff /tmp/tmp2oBBLB/brick7
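Typing these commands by hand is error-prone with eight bricks. As a sketch, the rebalance.py output can be turned into the setfattr commands with a bit of awk; `gen_setfattr` is a hypothetical helper name, and it assumes the output format shown in Step 2.

```shell
# gen_setfattr: read rebalance.py output on stdin, print one setfattr command per brick.
# Hypothetical helper; relies on the two line formats shown in the output of Step 2.
gen_setfattr() {
    awk '
        # Lines like "Backup_Volume-client-0: 0x...": remember the layout value.
        /^[^ ]+: 0x[0-9a-fA-F]+$/ {
            name = $1
            sub(/:$/, "", name)
            value[name] = $2
            next
        }
        # Lines like "Backup_Volume-client-0 on /tmp/.../brick0": emit the command.
        NF == 3 && $2 == "on" && ($1 in value) {
            printf "setfattr -n trusted.glusterfs.size-weighted -v %s %s\n", value[$1], $3
        }
    '
}
```

You could save the script output to a file, run `gen_setfattr < output.txt`, review the printed commands, and only then execute them.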

Step 4)

Unmount the temporarily mounted volumes that were mounted by rebalance.py:

# umount /tmp/tmp2oBBLB/*

Step 5)

Tell Gluster to start rebalancing the volume:

# gluster volume rebalance Backup_Volume start

Robots Social Network

Posted on: April 14, 2014 | By: Pieter de Rijk

Also check out Playing With Lego Robots Makes Your Employees Better At Their Jobs.

Transparent Bonnet by Land Rover

Posted on: April 10, 2014 | By: Pieter de Rijk

Pretty amazing (experimental) technology by Land Rover… Cameras under the hood and a live video feed onto the windshield.

“Where good ideas come from…” by Steven Johnson

Posted on: April 4, 2014 | By: Pieter de Rijk

Interesting video by Steven Johnson about ‘where good ideas come from’

Cloudflare CDN

Posted on: February 19, 2014 | By: Pieter de Rijk

Yesterday I enabled CloudFlare’s CDN (the Free Plan) without using their DNS infrastructure.

I simply looked up my entries on their DNS servers with nslookup and updated my zone files accordingly. 🙂

Investigate disk usage

Posted on: September 16, 2013 | By: Pieter de Rijk

Recently I had to investigate the usage of a very big volume… with a lot of data-files, owned by several users.

I started with “agedu”, but somehow I was not able to get the information I needed, so I switched to find with stat and loaded everything into MySQL.

So the first step was to do the following find command:

# find /nfs/bigfiler -exec stat --format="%F:%n:%s:%U:%u:%G:%g:%X:%Y:%Z" {} \; > /scratch/big-filer.info

Create a table in MySQL:

CREATE TABLE `pieter`.`bigfiler_content` (
  `fileid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `file_type` char(32) NOT NULL,
  `filename` varchar(512) NOT NULL,
  `size` bigint(20) unsigned NOT NULL, -- bigint, so files larger than 2 GiB fit
  `user` char(16) DEFAULT NULL,
  `uid` char(16) DEFAULT NULL,
  `groupname` char(16) DEFAULT NULL,
  `gid` char(16) DEFAULT NULL,
  `time_access` datetime DEFAULT NULL,
  `time_mod` datetime DEFAULT NULL,
  `time_change` datetime DEFAULT NULL,
  PRIMARY KEY (`fileid`),
  KEY `idx_file_type` (`file_type`) USING BTREE,
  KEY `idx_user` (`user`) USING BTREE
);

Load the data into MySQL:

LOAD DATA INFILE '/scratch/big-filer.info'
  INTO TABLE pieter.bigfiler_content
  FIELDS TERMINATED BY ':'
  LINES TERMINATED BY '\n'
  (file_type, filename, size, user, uid, groupname, gid, @time_access, @time_mod, @time_change)
  SET time_access = FROM_UNIXTIME(@time_access),
      time_mod = FROM_UNIXTIME(@time_mod),
      time_change = FROM_UNIXTIME(@time_change);

And now you can run nice queries to analyse the data 😀
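For a quick sanity check without MySQL, the same dump can also be summarized straight from the shell. This is only a sketch: `usage_per_user` is a hypothetical helper name, and it assumes that no file name contains a ':' (which would shift the fields).

```shell
# usage_per_user: read the colon-separated stat dump on stdin and print
# "user file_count total_bytes" per owner, sorted by user name.
# Assumes no file name contains ':'; field 3 is the size, field 4 the user.
usage_per_user() {
    awk -F: '
        { files[$4]++; bytes[$4] += $3 }
        END { for (u in files) printf "%s %d %.0f\n", u, files[u], bytes[u] }
    ' | LC_ALL=C sort
}
```

With the file created above, that would be `usage_per_user < /scratch/big-filer.info`.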

GlusterFS availability/healing of a Volume

Posted on: August 9, 2013 | By: Pieter de Rijk

I performed some investigation on availability/healing of GlusterFS Volumes.

In a “lab” environment I tested two types:

Distributed Volume

Replicated Volume

With these tests I wanted to answer a few questions:

What is the availability of files if one of the nodes dies/goes offline?

How will GlusterFS recover once the node comes back online?

Please note that in the tests the nodes were taken offline manually, while the client was not connected to the node that went offline. To fail over to a working node, you can use a tool like ucarp.

Replicated Volume

I created a simple replicated volume across 3 nodes:

Then node01 dies/goes offline

All the files remain available for reading, since every node holds a copy of them. You can also write data to the volume while node01 is offline.

Once Node01 comes back online, GlusterFS will “resync” the node:

Distributed Volume

We also created a simple distributed volume using 3 nodes:

And (again) node01 dies:

The files A and D are not available, since they live on node01; the other files are still available. You can also write to the volume, but new files will only be written to the nodes that are online (so they are not balanced across the 3 nodes):

Once Node01 is back online, the files A and D will become available again.

However, the volume is now out of balance, so a rebalance should be initiated to get the distributed volume back into good shape:

As you can see there are pros and cons of replicated and distributed volumes, but you can also combine these:

I haven’t tested this setup in the lab, but I can guess what will happen when a node dies. :-D

Replace a failing brick in a replicated GlusterFS Volume

Posted on: August 9, 2013 | By: Pieter de Rijk

On one of the servers with GlusterFS, I use USB disks in a replicated GlusterFS volume. Once in a while a disk just “dies”, and using the “replace brick” command is then not an option.

So I figured out the following steps:

Stop the GlusterFS Daemon

Remove the faulty brick

Configure the new brick to mount on the same mountpoint as the old one

Start GlusterFS Daemon

Trigger a self-heal on the GlusterFS Volume

Get coffee/..

The new brick will contain the data

Please note, this only works for replicated volumes!
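The steps above could be sketched as the following script. It is a dry run by default and only prints the commands; the volume name, device and mount point are placeholders, and mapping "remove the faulty brick" and "configure the new brick" to umount and mount is my assumption about a setup where each brick is its own disk.

```shell
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# replace_brick VOLUME DEVICE MOUNTPOINT (hypothetical helper, placeholder arguments)
replace_brick() {
    volume=$1 device=$2 mountpoint=$3
    run service glusterd stop                # stop the GlusterFS daemon
    run umount "$mountpoint"                 # remove the faulty brick
    run mount "$device" "$mountpoint"        # new disk on the same mount point
    run service glusterd start               # start the GlusterFS daemon
    run gluster volume heal "$volume" full   # trigger a full self-heal
}
```

Calling `replace_brick VolReplica /dev/sdb1 /mnt/brick1` prints the five commands in order, so you can review them before running anything with DRY_RUN=0.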

GlusterFS 10 minute guide on CentOS/RHEL/Fedora

Posted on: April 15, 2013 | By: Pieter de Rijk

At home I set up a small 3-node GlusterFS cluster based on CentOS 6.4, with one client running Fedora 16.

On the client and the nodes, iptables and SELinux are disabled.

The GlusterFS package is installed using the EPEL repository.

Preparing the nodes

The nodes have been kick-started using the following %packages in the kickstart:

%packages
@core
@server-policy

After enabling the EPEL repository you run the following commands on all nodes:

yum -y install glusterfs glusterfs-server
mkdir -p /mnt/glusterfs/test/{balanced,replicated}
service glusterd start
chkconfig glusterd on

Now on one of the nodes we bind all the systems to the “cluster”:

[root@node01 ~]# gluster peer status
No peers present
[root@node01 ~]# gluster peer probe node01
Probe on localhost not needed
[root@node01 ~]# gluster peer probe node02
Probe successful
[root@node01 ~]# gluster peer probe node03
Probe successful
[root@node01 ~]# gluster peer status
Number of Peers: 2

Hostname: node02
Uuid: 6d5d5101-5a8e-4057-a854-2aac49ce22ca
State: Peer in Cluster (Connected)

Hostname: node03
Uuid: f651c972-7519-4b74-b610-f1f2f4db31ef
State: Peer in Cluster (Connected)

Create the volumes

We create two volumes, one replicated and one distributed.

Replicated volume

[root@node01 ~]# gluster volume create VolReplica replica 3 transport tcp \
    node01:/mnt/glusterfs/test/replicated \
    node02:/mnt/glusterfs/test/replicated node03:/mnt/glusterfs/test/replicated
Creation of volume VolReplica has been successful. Please start the volume to access data.
[root@node01 ~]# gluster volume start VolReplica
Starting volume VolReplica has been successful
[root@node01 ~]# gluster volume info VolReplica

Volume Name: VolReplica
Type: Replicate
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: node01:/mnt/glusterfs/test/replicated
Brick2: node02:/mnt/glusterfs/test/replicated
Brick3: node03:/mnt/glusterfs/test/replicated

Balanced (distributed) volume

[root@node01 ~]# gluster volume create VolBalanced transport tcp \
    node01:/mnt/glusterfs/test/balanced \
    node02:/mnt/glusterfs/test/balanced node03:/mnt/glusterfs/test/balanced
[root@node01 ~]# gluster volume start VolBalanced
[root@node01 ~]# gluster volume info VolBalanced

Volume Name: VolBalanced
Type: Distribute
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: node01:/mnt/glusterfs/test/balanced
Brick2: node02:/mnt/glusterfs/test/balanced
Brick3: node03:/mnt/glusterfs/test/balanced
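Both create commands repeat the same brick path for every node. With more nodes, a small helper can build that list; `brick_list` is a hypothetical name used only for this sketch.

```shell
# brick_list PATH NODE...: print "node:PATH" for every node, separated by spaces.
brick_list() {
    path=$1
    shift
    list=
    for node in "$@"; do
        list="${list:+$list }$node:$path"   # add a space only between entries
    done
    printf '%s\n' "$list"
}
```

The replicated volume could then be created with `gluster volume create VolReplica replica 3 transport tcp $(brick_list /mnt/glusterfs/test/replicated node01 node02 node03)`.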

Configure the client(s)

You can easily use one of the nodes as a client as well, but I nominated my (old) laptop with Fedora 16 as the GlusterFS client.

First ensure the EPEL repositories are enabled, and then install the packages:

[root@laptoppie ~]# yum -y install glusterfs-fuse

Create the (target) mount points:

[root@laptoppie ~]# mkdir -p /mnt/glusterclient/{replicated,balanced}

And mount the filesystems:

[root@laptoppie ~]# mount -t glusterfs node01:VolReplica /mnt/glusterclient/replicated
[root@laptoppie ~]# df -h /mnt/glusterclient/replicated
Filesystem         Size Used Avail Use% Mounted on
node01:VolReplica   37G 924M   35G   3% /mnt/glusterclient/replicated
[root@laptoppie ~]# mount -t glusterfs node01:VolBalanced /mnt/glusterclient/balanced/
[root@laptoppie ~]# df -h /mnt/glusterclient/balanced/
Filesystem          Size Used Avail Use% Mounted on
node01:VolBalanced 111G 2.8G 103G   3% /mnt/glusterclient/balanced
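The two sizes reported by df fit the volume types: the replicated volume stores every file on all three bricks, so it offers the capacity of a single brick (37G), while the distributed volume adds the three bricks up (111G). As a rough sketch that ignores filesystem overhead and assumes equally sized bricks:

```shell
# usable_size BRICKS BRICK_SIZE REPLICA: usable volume size, in the unit of BRICK_SIZE.
# REPLICA is 1 for a purely distributed volume. Assumes equally sized bricks.
usable_size() {
    bricks=$1 brick_size=$2 replica=$3
    echo $(( bricks * brick_size / replica ))
}
```

Here `usable_size 3 37 3` corresponds to VolReplica and `usable_size 3 37 1` to VolBalanced.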