Quick-start with Gluster on AWS

I wanted to play around with Gluster a bit, and EC2 has gotten cheap enough that it makes sense to spin up a few instances. My goal is simple: set up Gluster running on two servers in different regions, and see how everything works between them. This is in no way a production-ready guide, or even necessarily good practice. But I found the official guides lacking and confusing. (For reference, they have a Really, Really Quick Start Guide and also one tailored to EC2; both took some tweaking.) Here’s what I did:

  • Start two EC2 instances. I used “Amazon Linux” on a t2.micro, and started one each in Sydney and Oregon. (Using different regions is in no way required; I’m doing that because I’m specifically curious how it will behave in that case.)
  • Configure the security groups from the outset. Every node needs access to every other node on the following ports (older versions used different ports):
    • TCP and UDP 111 (portmap)
    • TCP 49152 (Gluster allocates one such brick port per brick, counting up from 49152)
    • TCP 24007-24008
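If you'd rather script the security-group rules than click through the console, the same ports can be opened with the AWS CLI. This is only a sketch: the group ID is a placeholder, and in practice you'd restrict the source CIDR to the other node's address rather than 0.0.0.0/0.

```shell
# Hypothetical security group ID; substitute your own.
SG=sg-0123456789abcdef0

# portmap needs both TCP and UDP 111
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol tcp --port 111 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol udp --port 111 --cidr 0.0.0.0/0

# Gluster management daemon ports
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol tcp --port 24007-24008 --cidr 0.0.0.0/0

# Brick port (one per brick, starting at 49152)
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol tcp --port 49152 --cidr 0.0.0.0/0
```

Since each region has its own security groups, you'd run this once per region, against that region's group.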
  • Create a 5GB (or whatever you like, really) EBS volume for each instance; attach them. This will be our ‘brick’ that Gluster uses.
  • Pop this in /etc/yum.repos.d/glusterfs-epel.repo:
# Place this file in your /etc/yum.repos.d/ directory

[glusterfs-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-6/$basearch/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[glusterfs-noarch-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-6/noarch
enabled=1
skip_if_unavailable=1
gpgcheck=0

[glusterfs-source-epel]
name=GlusterFS is a clustered file-system capable of scaling to several petabytes. - Source
baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-6/SRPMS
enabled=0
skip_if_unavailable=1
gpgcheck=0
  • sudo yum install glusterfs glusterfs-fuse glusterfs-geo-replication glusterfs-server. This should pull in the necessary dependencies.
  • Now, set up those volumes:
    • sudo fdisk /dev/sdf (or whatever it was attached as); create a partition spanning the disk
    • Create a filesystem on it; I used sudo mkfs.ext4 /dev/sdf1 for now
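If you want to skip fdisk's interactive prompts, parted can do the same thing in one shot. A sketch, assuming the volume really did attach as /dev/sdf (on some instance types it shows up as /dev/xvdf instead; lsblk will tell you):

```shell
# Partition the whole disk and put ext4 on it, non-interactively.
sudo parted -s /dev/sdf mklabel msdos
sudo parted -s /dev/sdf mkpart primary ext4 1MiB 100%
sudo mkfs.ext4 /dev/sdf1
```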
  • Create a mountpoint and mount the filesystem:
sudo mkdir -p /exports/sdf1
sudo mount /dev/sdf1 /exports/sdf1
sudo mkdir -p /exports/sdf1/brick
  • Edit /etc/fstab and add the appropriate line, like:
/dev/sdf1   /exports/sdf1 ext4  defaults        0   0
  • Start gluster on each node; sudo service glusterd start
  • Peer probing… This tripped me up big time. The only way I got this to work was by creating fake hostnames for each box in /etc/hosts. I used gluster01 and gluster02 for names. /etc/hosts mapped gluster01 to 127.0.0.1 on gluster01, and gluster02 to 127.0.0.1 on gluster02 (with each box’s entry for the *other* name pointing at that peer’s public IP). Then, from one node (it doesn’t matter which), probe the other by the hostname you just assigned: sudo gluster peer probe gluster02. You don’t need to repeat this from the other host; they’ll see each other.
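Concretely, here's roughly what /etc/hosts looked like on each box, followed by the probe itself. The public IPs below are placeholders; use each instance's actual address.

```shell
# /etc/hosts on gluster01 (the IP for gluster02 is a placeholder):
#   127.0.0.1   localhost gluster01
#   203.0.113.7 gluster02
#
# /etc/hosts on gluster02, mirrored:
#   127.0.0.1   localhost gluster02
#   203.0.113.6 gluster01

# From gluster01, probe the peer and confirm it shows up:
sudo gluster peer probe gluster02
sudo gluster peer status
```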
  • Create the volume with a replica count of 2 (one copy per node), on one of them:
sudo gluster volume create test1 rep 2 gluster01:/exports/sdf1/brick gluster02:/exports/sdf1/brick

This will fail miserably if you didn’t get the hostname thing right. You can’t do it by public IP, and you can’t directly use localhost. If it works right, you’ll see “volume create: test1: success: please start the volume to access data”. So, let’s do that.

  • sudo gluster volume start test1 (you can then inspect it with sudo gluster volume status)
  • Now, mount it. On each box: sudo mkdir /mnt/storage. Then, on each box, mount it with a reference to one of the Gluster nodes: sudo mount -t glusterfs gluster01:test1 /mnt/storage (either gluster01:test1 or gluster02:test1 will find the right volume). This may take a bit if it’s going across oceans.
  • cd into /mnt/storage, create a file, and see that it appears on the other. Magic!
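To convince yourself replication is actually happening, write a file through one mount and read it through the other. A minimal check (first command on one node, the rest on the other):

```shell
# On gluster01, write through the Gluster mount:
echo "hello from gluster01" | sudo tee /mnt/storage/hello.txt

# On gluster02, the file should appear:
cat /mnt/storage/hello.txt
ls -l /exports/sdf1/brick/   # the brick directory holds the replicated copy too
```

One caveat: always read and write through the Gluster mount, never directly in the brick directory; files written straight into the brick bypass Gluster and won't replicate.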

Please keep in mind that this was the bare minimum for a cobbled-together test, and is surely not a good production setup.

Also, replicating Gluster between Sydney and Oregon is horribly slow. Don’t do that! Even when it’s not across continents, Gluster doesn’t do well across a WAN.
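If you genuinely need cross-region copies, Gluster's intended answer is geo-replication (the glusterfs-geo-replication package installed earlier), which syncs asynchronously from a master volume to a slave rather than replicating synchronously on every write. I haven't tested this here; going by the docs, the rough shape of the commands is something like the following, where remotenode and backup are hypothetical names:

```shell
# Hypothetical: asynchronously replicate volume test1 to volume
# "backup" on host "remotenode" (requires SSH key setup between nodes).
sudo gluster volume geo-replication test1 remotenode::backup create push-pem
sudo gluster volume geo-replication test1 remotenode::backup start
```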

Comments

  1. Unfortunately this didn’t work for me with Amazon Linux 2015.09; I got a dependency error:

    Error: Package: glusterfs-libs-3.7.6-1.el7.x86_64 (glusterfs-epel)
    Requires: rsyslog-mmjsonparse
    Error: Package: glusterfs-server-3.7.6-1.el7.x86_64 (glusterfs-epel)
    Requires: liburcu-bp.so.1()(64bit)
    Error: Package: glusterfs-server-3.7.6-1.el7.x86_64 (glusterfs-epel)
    Requires: systemd-units
    Error: Package: glusterfs-server-3.7.6-1.el7.x86_64 (glusterfs-epel)
    Requires: liburcu-cds.so.1()(64bit)
    You could try using --skip-broken to work around the problem
    You could try running: rpm -Va --nofiles --nodigest

    Going to try with Ubuntu instead as that seems to be the only solution!
