Contribution to the TensorFlow Project

It has been a while since I last posted something on my weblog… sorry for that… I’ve been very busy the last year (not a real excuse not to post, but it does explain it). Recently I’ve been involved in working with TensorFlow, the Machine …

New blog software

And finally I found the time to install WordPress for my weblog. As you can see, the design has changed. I hope that in the future I will give you a lot of nice information on the things I do. The old blog is …

Courage for safety

This video was shown during an internal course about safety within the company I work for… It got me thinking about safety. Not that I’m working off-shore… but it can happen anywhere, including a normal office area!

The video is originally in English, but I found it with Dutch subtitles as well.

Courage For Safety [Woodside] from arboTV on Vimeo.

GlusterFS rebalance weight based (WIP)

Please note that I still have to perform further testing to validate whether this works as expected… but I would like to at least share it…

On one of the GlusterFS instances I manage, I have a wide variety of disk (brick) sizes:

5x 1TB
2x 2TB
1x 3TB

However, GlusterFS currently does not take different disk sizes into account when ‘rebalancing’ the data.

After some searching on the Internet I noticed that there is a proposal to build this into the GlusterFS code (check this proposal on the Gluster Community).

So far the steps are actually pretty ‘simple’…

Step 1)

Download the Python scripts (as root):

# mkdir -p $HOME/glusterfs-weighted-rebalance
# cd $HOME/glusterfs-weighted-rebalance
# wget https://raw.githubusercontent.com/gluster/glusterfs/master/extras/rebalance.py \
       https://raw.githubusercontent.com/gluster/glusterfs/master/extras/volfilter.py

Step 2)

Run the Python script:

# python rebalance.py -l glusterfs-cluster Backup_Volume
Here are the xattr values for your size-weighted layout:
Backup_Volume-client-0: 0x00000002000000000000000015557a94
Backup_Volume-client-1: 0x000000020000000015557a952aaaf523
Backup_Volume-client-2: 0x00000002000000002aaaf52440006fb2
Backup_Volume-client-3: 0x000000020000000040006fb35555ea41
Backup_Volume-client-4: 0x00000002000000005555ea426aab64d0
Backup_Volume-client-5: 0x00000002000000006aab64d19555c505
Backup_Volume-client-6: 0x00000002000000009555c506d5559fca
Backup_Volume-client-7: 0x0000000200000000d5559fcbffffffff
The following subvolumes are still mounted:
Backup_Volume-client-0 on /tmp/tmp2oBBLB/brick0
Backup_Volume-client-1 on /tmp/tmp2oBBLB/brick1
Backup_Volume-client-2 on /tmp/tmp2oBBLB/brick2
Backup_Volume-client-3 on /tmp/tmp2oBBLB/brick3
Backup_Volume-client-4 on /tmp/tmp2oBBLB/brick4
Backup_Volume-client-5 on /tmp/tmp2oBBLB/brick5
Backup_Volume-client-6 on /tmp/tmp2oBBLB/brick6
Backup_Volume-client-7 on /tmp/tmp2oBBLB/brick7
Don’t forget to clean up when you’re done.

Step 3)

Set the xattr trusted.glusterfs.size-weighted per brick to the values mentioned above:

# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000000000000015557a94 /tmp/tmp2oBBLB/brick0
# setfattr -n trusted.glusterfs.size-weighted -v 0x000000020000000015557a952aaaf523 /tmp/tmp2oBBLB/brick1
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000002aaaf52440006fb2 /tmp/tmp2oBBLB/brick2
# setfattr -n trusted.glusterfs.size-weighted -v 0x000000020000000040006fb35555ea41 /tmp/tmp2oBBLB/brick3
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000005555ea426aab64d0 /tmp/tmp2oBBLB/brick4
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000006aab64d19555c505 /tmp/tmp2oBBLB/brick5
# setfattr -n trusted.glusterfs.size-weighted -v 0x00000002000000009555c506d5559fca /tmp/tmp2oBBLB/brick6
# setfattr -n trusted.glusterfs.size-weighted -v 0x0000000200000000d5559fcbffffffff /tmp/tmp2oBBLB/brick7
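
To verify that a value was stored correctly, you can read it back with getfattr (shown here for brick0 only):

# getfattr -n trusted.glusterfs.size-weighted -e hex /tmp/tmp2oBBLB/brick0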

Step 4)

Unmount the temporary volumes that were mounted by rebalance.py:

# umount /tmp/tmp2oBBLB/*

Step 5) 

Start the GlusterFS rebalance of the volume:

# gluster volume rebalance Backup_Volume start
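
The progress of the rebalance can then be followed with the status command:

# gluster volume rebalance Backup_Volume status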

Cloudflare CDN

Since yesterday I have enabled CloudFlare’s CDN (the Free Plan) without using their DNS infrastructure.

I simply looked up my entries on their DNS servers with nslookup and updated my zone files accordingly. 🙂 
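
As a rough sketch, with placeholder names and a documentation IP address (not my real records): you ask one of the CloudFlare name servers directly for the proxied hostname, and copy the answer into your own zone file.

# nslookup www.example.org adel.ns.cloudflare.com

; zone file of example.org (203.0.113.10 is just an example proxy IP)
www    IN    A    203.0.113.10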

GlusterFS availability/healing of a Volume

I performed some investigation on availability/healing of GlusterFS Volumes.

In a “lab” environment I tested two types:

  1. Distributed Volume
  2. Replicated Volume

With the tests I wanted to answer a few questions:

  1. What is the availability of the files if one of the nodes dies/goes offline?
  2. How will GlusterFS recover once the node comes back online?

Please note that in these tests the nodes were pulled offline manually, and the client was not connected to the node that went offline. For failing over to a working node, you can use tools like ucarp.

    Replicated Volume

    Created a simple replicated volume across 3 nodes:
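
    For reference, a replica-3 volume like this can be created along these lines (the node names and brick paths are just placeholders, not the exact ones from my lab):

    # gluster volume create Test_Volume replica 3 node01:/bricks/brick1 node02:/bricks/brick1 node03:/bricks/brick1
    # gluster volume start Test_Volume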

    Then node01 dies/goes offline:

    All the files will still be available for reading, since they are replicated to all of the nodes. You can also write data to the volume while node01 is offline.

    Once Node01 comes back online, GlusterFS will “resync” the node:
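
    On current GlusterFS versions the self-heal daemon takes care of this automatically; to check on (or trigger) the healing of the hypothetical Test_Volume from above you can use:

    # gluster volume heal Test_Volume
    # gluster volume heal Test_Volume info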

    Distributed Volume

    We also created a simple distributed volume using 3 nodes:
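
    Again as a sketch with placeholder names, a plain distributed volume is simply created without the replica option:

    # gluster volume create Test_Dist_Volume node01:/bricks/brick1 node02:/bricks/brick1 node03:/bricks/brick1
    # gluster volume start Test_Dist_Volume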

    And (again) node01 dies:

    The files A and D are not available, since they live on node01; the other files are still available. You can also write to the volume, but the new files will be written to the nodes that are online (so the data is not balanced across the 3 nodes):

    Once Node01 is back online, the files A and D will become available again.

    But the volume is out of balance, so a rebalance should be initiated to get the distributed volume back into good shape.
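
    As a sketch with the placeholder volume name from above, that rebalance would be started and monitored like this:

    # gluster volume rebalance Test_Dist_Volume start
    # gluster volume rebalance Test_Dist_Volume status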

    As you can see, there are pros and cons to both replicated and distributed volumes, but you can also combine the two into a distributed-replicated volume.
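
    Such a combined setup could be created roughly like this (placeholder names again; the six bricks form two replica-3 sets that the data is distributed over):

    # gluster volume create Test_Dist_Repl replica 3 node01:/bricks/brick1 node02:/bricks/brick1 node03:/bricks/brick1 node01:/bricks/brick2 node02:/bricks/brick2 node03:/bricks/brick2
    # gluster volume start Test_Dist_Repl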

    I haven’t tested this setup in the lab yet… but you can guess what will happen when a node dies. :-D