Getting started with CEPH RGW

In this post, I'd like to capture rough notes and the commands I used to get a minimal Ceph RGW cluster configured on my Fedora 28 VM.

OS version:

# cat /etc/redhat-release
Fedora release 28 (Twenty Eight)

System requirements:

  • >25 GB of storage space where the Ceph source code will be built
  • preferably >8 GB of RAM

Installing manually from source code:

To build the Ceph code:

Source: http://docs.ceph.com/docs/mimic/install/build-ceph/

  • git clone https://github.com/ceph/ceph.git
  • cd ceph/
  • git status
  • git submodule update --force --init --recursive
  • ./install-deps.sh
  • ./do_cmake.sh
  • cd build/
  • make
  • make install
  • ln -s /usr/local/lib64/librado* /usr/lib64/

To start a Ceph cluster locally on the VM:

  • cd ceph/build
  • MON=1 OSD=1 MDS=1 MGR=1 RGW=1 ../src/vstart.sh -n -d
    • creates one of each of the services listed
  • ps aux | grep ceph
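
Once vstart has finished, the RGW frontend should be answering HTTP requests. The boto script later in this post assumes it listens on port 8000 (the vstart default); here is a quick, hedged sanity check from Python:

try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2

# anonymous request to the RGW endpoint started by vstart.sh
resp = urlopen("http://localhost:8000/")
print(resp.getcode())      # expect 200
print(resp.read()[:200])   # ListAllMyBucketsResult XML for the anonymous user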

Now create RGW users/sub-users:

Source: http://docs.ceph.com/docs/mimic/radosgw/admin/

We need to create users for S3 access and sub-users for SWIFT access.

  • radosgw-admin user create --uid=rgwuser --display-name="RGW user"
    • by default an access key and secret key get generated; check the user info
  • radosgw-admin subuser create --uid=rgwuser --subuser=rgwuser:swift --access=full
  • radosgw-admin caps add --uid=rgwuser --caps="users=*;buckets=*"
    • (adds capabilities)
  • radosgw-admin user info --uid=rgwuser
    • to check the user/sub-user info
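
The access and secret keys generated above are what the S3 and SWIFT tests below use. As a convenience, here is a small sketch that pulls them out of the JSON printed by radosgw-admin user info; it assumes radosgw-admin can reach the vstart cluster (you may need to run it from the build directory or pass -c with the vstart ceph.conf), and that the JSON output has the usual "keys" and "swift_keys" sections:

import json
import subprocess

# query the user created above
out = subprocess.check_output(["radosgw-admin", "user", "info", "--uid=rgwuser"])
info = json.loads(out)

# S3 credentials live under "keys", SWIFT sub-user keys under "swift_keys"
s3_key = info["keys"][0]
swift_key = info["swift_keys"][0]

print("S3 access key : " + s3_key["access_key"])
print("S3 secret key : " + s3_key["secret_key"])
print("SWIFT key for {0}: {1}".format(swift_key["user"], swift_key["secret_key"]))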

To test S3 access (using python-boto):

Source: http://docs.ceph.com/docs/mimic/radosgw/s3/python/

  • dnf install python-boto
  • Below is a sample script exercising various functionalities –
import boto
import boto.s3.connection

access_key = "ZGX5BVGID059T7DJLM0S"
secret_key = "koe8DFgGwNk5sQTRVxHaDiEgYQDVj8XVXMdZ4ULd"

boto.config.add_section('s3')
boto.config.set('s3', 'use-sigv4', 'True')

# create connection
conn = boto.connect_s3(
    aws_access_key_id = access_key,
    aws_secret_access_key = secret_key,
    host = 's3.localhost',
    port = 8000,
    is_secure = False,
    calling_format = boto.s3.connection.OrdinaryCallingFormat(),
)

# create bucket
bucket = conn.create_bucket('my-new-bucket')

# list buckets created
for bucket in conn.get_all_buckets():
    print "{name}\t{created}".format(
        name = bucket.name,
        created = bucket.creation_date,
    )

# insert file "hello.txt" as an object into the bucket
key = bucket.new_key('hello.txt')
key.set_contents_from_string('Hello World!')  # write contents

# list objects
for key in bucket.list():
    print "{name}\t{size}\t{modified}".format(
        name = key.name,
        size = key.size,
        modified = key.last_modified,
    )

# connect to a specific bucket and object
my_bucket = conn.get_bucket('my-new-bucket')
hello_key = my_bucket.get_key('hello.txt')

# make the hello.txt object public
hello_key.set_canned_acl('public-read')

# generate a web URL for the file hello.txt
hello_url = hello_key.generate_url(0, query_auth=False, force_http=True)
print hello_url

# copy hello.txt to the local filesystem
key.get_contents_to_filename('/workspace/scripts/sample_hello.txt')
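
The URL generated above works because the object was made public (query_auth=False and a 'public-read' ACL). If you would rather hand out a temporary link without changing the ACL, boto can generate a time-limited signed URL instead; a minimal sketch continuing from the script above:

# signed URL valid for one hour; no ACL change needed on the object
signed_url = hello_key.generate_url(
    expires_in = 3600,    # seconds
    query_auth = True,    # embed the signature in the query string
    force_http = True,
)
print(signed_url)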

To test SWIFT access:

Source: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/object_gateway_guide_for_red_hat_enterprise_linux/rgw-configuration-rgw#test-swift-access

  • dnf install python-setuptools
  • easy_install pip
  • pip install --upgrade setuptools
  • pip install --upgrade python-swiftclient
  • To list already created containers:
    • swift -A http://{IP ADDRESS}:{port}/auth/1.0 -U testuser:swift -K '{swift_secret_key}' list
    • eg: 
      #swift -A http://localhost:8000/auth/1.0 -U rgwuser:swift -K '5euZSGD4ikdo4ppVYNiQ1czRXSACHkuO367S41Po' list
      my-new-bucket
  • To insert an object into the container:
# echo " swift access " > swift_file
# swift -A http://localhost:8000/auth/1.0 -U rgwuser:swift -K '5euZSGD4ikdo4ppVYNiQ1czRXSACHkuO367S41Po' upload my-new-bucket swift_file 
swift_file
# swift -A http://localhost:8000/auth/1.0 -U rgwuser:swift -K '5euZSGD4ikdo4ppVYNiQ1czRXSACHkuO367S41Po' list my-new-bucket
hello.txt
swift_file
  • To download an object:
#  swift -A http://localhost:8000/auth/1.0 -U rgwuser:swift -K '5euZSGD4ikdo4ppVYNiQ1czRXSACHkuO367S41Po' download my-new-bucket hello.txt 
hello.txt [auth 0.003s, headers 0.008s, total 0.008s, 0.002 MB/s]
# cat hello.txt 
Hello World!
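
The same SWIFT operations can also be driven from Python through the python-swiftclient library installed above. A minimal sketch, assuming the rgwuser:swift sub-user created earlier; substitute your own swift secret key, and note the object name swift_file2 is just an example:

import swiftclient.client

# auth v1.0 endpoint exposed by RGW, same as used with the swift CLI above
conn = swiftclient.client.Connection(
    authurl='http://localhost:8000/auth/1.0',
    user='rgwuser:swift',
    key='5euZSGD4ikdo4ppVYNiQ1czRXSACHkuO367S41Po',   # swift_secret_key
)

# list containers
for container in conn.get_account()[1]:
    print(container['name'])

# upload an object and read it back
conn.put_object('my-new-bucket', 'swift_file2', contents='swift access via python\n')
headers, body = conn.get_object('my-new-bucket', 'swift_file2')
print(body)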

 

Using Ceph-nano:

Source: https://github.com/ceph/cn

  • make sure docker is installed and running.
    #systemctl start docker
  • run a sample docker image to check that it's working
    #docker run hello-world

    • For me this didn't work at first on the VM; after updating the packages with "dnf update" and restarting the VM, it worked.
  • git clone https://github.com/ceph/cn.git
  • cd cn/
  • make
  • To create a container and configure a Ceph cluster:
    • ./cn cluster start -d /root/tmp my-first-cluster
    • ./cn cluster status my-first-cluster
    • docker ps
    • docker image ls
    • ./cn cluster ls
    • ./cn cluster enter my-first-cluster
  • To create an S3 bucket and upload an object into it:
    • ./cn s3 mb my-first-cluster my-buc
    • ./cn s3 put my-first-cluster /etc/passwd my-buc
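
Since ceph-nano exposes an S3 endpoint as well, the python-boto script from earlier can be pointed at it too. A rough sketch; the host, port and credentials below are placeholders, so substitute the values printed by ./cn cluster start or ./cn cluster status:

import boto
import boto.s3.connection

# placeholders; copy the real values from the cn output
CN_HOST = '127.0.0.1'
CN_PORT = 8000
CN_ACCESS_KEY = '<access key printed by cn>'
CN_SECRET_KEY = '<secret key printed by cn>'

conn = boto.connect_s3(
    aws_access_key_id = CN_ACCESS_KEY,
    aws_secret_access_key = CN_SECRET_KEY,
    host = CN_HOST,
    port = CN_PORT,
    is_secure = False,
    calling_format = boto.s3.connection.OrdinaryCallingFormat(),
)

# the bucket created with ./cn s3 mb should show up here
for bucket in conn.get_all_buckets():
    print(bucket.name)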

 

 

 

Quick guide to export Gluster volume via NFS-Ganesha on Fedora

This post mainly aims to provide quick guidance to anyone looking to export a Gluster volume via a stand-alone NFS-Ganesha server running on any of the Gluster storage pool nodes, without getting into too many internals.

We shall use a few configuration scripts packaged as part of the 'glusterfs-ganesha' rpm, which were originally designed to configure HA for the nfs-ganesha service. However, these scripts are modelled in such a way that they can be used (with some tweaks) to export volumes on a stand-alone machine as well, without any HA configuration.

This guide assumes that a GlusterFS trusted storage pool is set up and that a volume to be exported has been created. If you would like a more detailed walk-through with instructions on setting up either Gluster or NFS-Ganesha, please look at the guides linked in the References section at the end of this post.

I used a machine with Fedora 24 installed.

#cat /etc/redhat-release 
Fedora release 24 (Twenty Four)

Install nfs-ganesha

On one of the storage pool nodes, install the nfs-ganesha-gluster package using the command below –

# sudo dnf install nfs-ganesha-gluster

#sudo dnf install nfs-ganesha-gluster
Last metadata expiration check: 0:28:38 ago on Mon Aug 15 23:22:51 2016.
Dependencies resolved.
==================================================================================
 Package                 Arch       Version                     Repository   Size
==================================================================================
Installing:
 jemalloc                x86_64     4.1.0-1.fc24                fedora      180 k
 libntirpc               x86_64     1.4.0-0.3pre3.fc24          updates     124 k
 nfs-ganesha             x86_64     2.4.0-0.14dev27.fc24        updates     599 k
 nfs-ganesha-gluster     x86_64     2.4.0-0.14dev27.fc24        updates      49 k

Transaction Summary
==================================================================================
Install  4 Packages

Total download size: 953 k
Installed size: 2.5 M
Is this ok [y/N]: y
Downloading Packages:
(1/4): nfs-ganesha-gluster-2.4.0-0.14dev27.fc24.x  33 kB/s |  49 kB     00:01    
(2/4): jemalloc-4.1.0-1.fc24.x86_64.rpm           103 kB/s | 180 kB     00:01    
(3/4): libntirpc-1.4.0-0.3pre3.fc24.x86_64.rpm    219 kB/s | 124 kB     00:00    
(4/4): nfs-ganesha-2.4.0-0.14dev27.fc24.x86_64.rp 252 kB/s | 599 kB     00:02    
----------------------------------------------------------------------------------
Total                                             175 kB/s | 953 kB     00:05     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Installing  : libntirpc-1.4.0-0.3pre3.fc24.x86_64                           1/4 
  Installing  : jemalloc-4.1.0-1.fc24.x86_64                                  2/4 
  Installing  : nfs-ganesha-2.4.0-0.14dev27.fc24.x86_64                       3/4 
warning: /etc/sysconfig/ganesha created as /etc/sysconfig/ganesha.rpmnew
  Installing  : nfs-ganesha-gluster-2.4.0-0.14dev27.fc24.x86_64               4/4 
  Verifying   : nfs-ganesha-gluster-2.4.0-0.14dev27.fc24.x86_64               1/4 
  Verifying   : nfs-ganesha-2.4.0-0.14dev27.fc24.x86_64                       2/4 
  Verifying   : jemalloc-4.1.0-1.fc24.x86_64                                  3/4 
  Verifying   : libntirpc-1.4.0-0.3pre3.fc24.x86_64                           4/4 

Installed:
  jemalloc.x86_64 4.1.0-1.fc24                                                    
  libntirpc.x86_64 1.4.0-0.3pre3.fc24                                             
  nfs-ganesha.x86_64 2.4.0-0.14dev27.fc24                                         
  nfs-ganesha-gluster.x86_64 2.4.0-0.14dev27.fc24                                 

Complete!

Stop other NFS services

Stop the kernel-NFS and gluster-NFS services if they are already running on the system.

To stop kernel-NFS:

#sudo systemctl stop nfs

To stop gluster-NFS:

#gluster vol set <volname> nfs.disable ON (Note: this command has to be repeated for all the volumes in the trusted-pool)
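
Since nfs.disable has to be set on every volume, a small helper can loop over the pool. A sketch in Python using the same gluster CLI; it assumes the gluster command is on the PATH and that the script is run with sufficient privileges:

import subprocess

# list all volumes in the trusted storage pool
volumes = subprocess.check_output(["gluster", "volume", "list"]).split()

# disable gluster-NFS on each of them
for vol in volumes:
    subprocess.check_call(["gluster", "volume", "set", vol, "nfs.disable", "on"])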

Start NFS-Ganesha service

Now start the NFS-Ganesha service using the below command –

#sudo systemctl start nfs-ganesha

Verify the status

Verify that the service started successfully in any of the following ways:

To check the status of the service –

#systemctl status nfs-ganesha

#sudo systemctl status nfs-ganesha
● nfs-ganesha.service - NFS-Ganesha file server
   Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2016-08-15 23:58:05 IST; 2min 37s ago
     Docs: http://github.com/nfs-ganesha/nfs-ganesha/wiki
  Process: 14780 ExecStartPost=/bin/bash -c prlimit --pid $MAINPID --nofile=$NOFILE:$NOFILE (code=exited, status=0/SUCCESS)
  Process: 14778 ExecStart=/bin/bash -c ${NUMACTL} ${NUMAOPTS} /usr/bin/ganesha.nfsd ${OPTIONS} ${EPOCH} (code=exited, status=0/SUCCESS)
 Main PID: 14779 (ganesha.nfsd)
    Tasks: 32 (limit: 512)
   CGroup: /system.slice/nfs-ganesha.service
           └─14779 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -E 6319123899418411008

Aug 15 23:58:04 xxx.yyy.com systemd[1]: Starting NFS-Ganesha file server...
Aug 15 23:58:05 xxx.yyy.com systemd[1]: Started NFS-Ganesha file server.

Verify that the ports used by NFS (and side-band services like mountd, nlm, etc.) are listed in the output of the command below –

#rpcinfo -p

# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  54009  status
    100024    1   tcp  59721  status
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   udp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   udp  37090  mountd
    100005    1   tcp  46233  mountd
    100005    3   udp  37090  mountd
    100005    3   tcp  46233  mountd
    100021    4   udp  50560  nlockmgr
    100021    4   tcp  45229  nlockmgr
    100011    1   udp    875  rquotad
    100011    1   tcp    875  rquotad
    100011    2   udp    875  rquotad
    100011    2   tcp    875  rquotad

Verify that the pseudo path is exported by default –

#showmount -e localhost

#showmount -e localhost
Export list for localhost:

Download configuration scripts

The configuration scripts below are packaged as part of the 'glusterfs-ganesha' rpm and installed under '/usr/libexec/ganesha'. If that package is not installed, download them explicitly.

#sudo mkdir -p /usr/libexec/ganesha

#cd /usr/libexec/ganesha

#sudo wget https://raw.githubusercontent.com/gluster/glusterfs/release-3.10/extras/ganesha/scripts/create-export-ganesha.sh

#sudo wget https://raw.githubusercontent.com/gluster/glusterfs/release-3.10/extras/ganesha/scripts/dbus-send.sh

#sudo chmod 755 create-export-ganesha.sh dbus-send.sh

#sudo touch /etc/ganesha/exports/.export_added

#mkdir -p /usr/libexec/ganesha
#cd /usr/libexec/ganesha
#sudo wget https://raw.githubusercontent.com/gluster/glusterfs/master/extras/ganesha/scripts/create-export-ganesha.sh
--2016-08-16 00:15:07--  https://raw.githubusercontent.com/gluster/glusterfs/master/extras/ganesha/scripts/create-export-ganesha.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.56.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.56.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2103 (2.1K) [text/plain]
Saving to: ‘create-export-ganesha.sh’

create-export-ganesha.sh                  100%[=====================================================================================>]   2.05K  --.-KB/s    in 0s      

2016-08-16 00:15:08 (17.2 MB/s) - ‘create-export-ganesha.sh’ saved [2103/2103]

#sudo wget https://raw.githubusercontent.com/gluster/glusterfs/master/extras/ganesha/scripts/dbus-send.sh
--2016-08-16 00:14:16--  https://raw.githubusercontent.com/gluster/glusterfs/master/extras/ganesha/scripts/dbus-send.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.56.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.56.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2928 (2.9K) [text/plain]
Saving to: ‘dbus-send.sh’

dbus-send.sh                              100%[=====================================================================================>]   2.86K  --.-KB/s    in 0s      

2016-08-16 00:14:17 (30.7 MB/s) - ‘dbus-send.sh’ saved [2928/2928]

#sudo chmod 744 create-export-ganesha.sh dbus-send.sh
#sudo touch /etc/ganesha/exports/.export_added

Create export config file for the volume

To create an export config file (with default options) for any volume, use the below command –

#sudo /usr/libexec/ganesha/create-export-ganesha.sh /etc/ganesha on <volume_name>

Verify that an export configuration file for the volume has been created in the '/etc/ganesha/exports' directory. To add new options or modify existing ones, edit this file.

#sudo /usr/libexec/ganesha/create-export-ganesha.sh /etc/ganesha on testvol

#cat /etc/ganesha/exports/export.testvol.conf 
# WARNING : Using Gluster CLI will overwrite manual
# changes made to this file. To avoid it, edit the
# file and run ganesha-ha.sh --refresh-config.
EXPORT{
      Export_Id = 2;
      Path = "/testvol";
      FSAL {
           name = GLUSTER;
           hostname="localhost";
          volume="testvol";
           }
      Access_type = RW;
      Disable_ACL = true;
      Squash="No_root_squash";
      Pseudo="/testvol";
      Protocols = "3", "4" ;
      Transports = "UDP","TCP";
      SecType = "sys";
     }

Export the volume via nfs-ganesha

Make sure the above step (creating the export configuration file) is done.

To export the volume, use the command –

#sudo /usr/libexec/ganesha/dbus-send.sh /etc/ganesha on <volume_name>

#sudo bash -x /usr/libexec/ganesha/dbus-send.sh /etc/ganesha on testvol

Now verify that the volume is exported, using the showmount command –

#showmount -e localhost
Export list for localhost:
/testvol (everyone)

To export any other volume, repeat the last two steps; a small wrapper sketch follows below.
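
A sketch of such a wrapper in Python, assuming the two scripts were downloaded to /usr/libexec/ganesha as described earlier, that nfs-ganesha is running, and that this is run as root (the volume names are just examples):

import subprocess

GANESHA_DIR = "/etc/ganesha"
SCRIPT_DIR = "/usr/libexec/ganesha"

volumes = ["testvol", "testvol2"]   # example volume names

for vol in volumes:
    # create the export config file under /etc/ganesha/exports
    subprocess.check_call(
        [SCRIPT_DIR + "/create-export-ganesha.sh", GANESHA_DIR, "on", vol])
    # ask the running nfs-ganesha (over D-Bus) to pick up the new export
    subprocess.check_call(
        [SCRIPT_DIR + "/dbus-send.sh", GANESHA_DIR, "on", vol])

# verify the exports
subprocess.check_call(["showmount", "-e", "localhost"])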

Unexport the volume via nfs-ganesha

To unexport the volume, use the command –

#sudo /usr/libexec/ganesha/dbus-send.sh /etc/ganesha off <volume_name>

#sudo bash -x /usr/libexec/ganesha/dbus-send.sh /etc/ganesha off testvol

Now verify that the volume is no longer exported, again using the showmount command –

#showmount -e localhost
Export list for localhost:

 

References:

GlusterFS: Understanding Upcall infrastructure and cache-invalidation support

GlusterFS, a scale-out storage platform, is a distributed file system that follows a client-server architectural model. For more details about GlusterFS, please check http://www.gluster.org.

It is usually the client (glusterfs) which initiates an RPC request to the server (glusterfsd). After processing the request, the server sends a reply to the client as the response to that same request. So far, there has been no interface or use-case for the server to notify or make a request to the client.

This support is now being added through the "Upcall Infrastructure". It is a generic and extensible framework used to maintain state in the glusterfsd process for each file accessed (including info about the clients performing fops on it) and to send notifications to the respective glusterfs clients in case of any change in that state.

A few of the currently identified use-cases of this infrastructure are:

  • Cache Inode Update/Invalidation
  • Recall Delegations/lease locks

More details on the feature and design are documented at

http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure

At present, the Upcall framework and cache-invalidation support have been added. Lease-lock support is still a work in progress. This post mainly aims to explain the code flow of the changes done so far (excluding lease locks).

Patches currently under review are –

http://review.gluster.org/#/c/9534/
http://review.gluster.org/#/c/9535/
http://review.gluster.org/#/c/9536/

rpc changes

In GlusterFS, there already exists an rpc routine to send callback requests, "rpcsvc_callback_submit (..)". But this routine takes an 'iovec' as input. So a new wrapper function, "rpcsvc_request_submit(..)", has been added to XDR_ENCODE the binary data and copy it into an iovec before making the callback request. We shall use this routine to send upcalls.

Upcall xlator

A new xlator (upcall) has been defined to maintain and process the state of events which require the server to send upcall notifications. For each I/O on an inode, we create/update an 'upcall_inode_ctx' and store/update the list of clients' info ('upcall_client_t') in that context. The list is protected by a mutex lock.

By default this xlator is OFF. It will be enabled using an xlator option which will be added later.

libglusterfs

A new XDR struct, "gfs3_xdr_upcall_req", has been defined to encapsulate all the data that needs to be passed on to the clients in case of any upcall notifications.

A new notify event, "GF_EVENT_UPCALL", has been added; this new xlator uses it to pass the upcall event data, along with the uid of the client it has to be sent to, up to the parent xlators.

A new cbk event “GF_CBK_UPCALL” is defined to send/detect upcall events from the server to the client.

protocol/server

protocol/server, on receiving this upcall event and data, retrieves the rpc transport connection details of the client based on the client_uid passed, by scanning the list of rpc transport objects it is connected to.

protocol/server makes use of this 'GF_CBK_UPCALL' event to send the callback request, along with the upcall event data, to the client.

protocol/client

protocol/client, on receiving this upcall CBK event (in 'client3_cbk_upcall(..)'), XDR_DECODEs the upcall data passed and notifies the parent xlator.

gfapi

A new list has been added to the 'glfs' object to store all the upcall notifications received. On receiving an upcall notification, gfapi creates a new upcall_event entry out of the upcall event data passed (in 'glfs_upcall(..)') and appends it to the corresponding glfs object's upcall list.

It is left to the applications to write APIs to read and process the events stored in the upcall list.

gfapi/handleops

For NFS-Ganesha, we currently use a polling mechanism to read these upcall events. NFS-Ganesha makes use of the 'glfs_h_poll_upcall(..)' API, which reads the upcall events stored in the upcall list one by one and maps them to the data structures used by Ganesha before handing them over.

In the future, this may be replaced by a callback/notification mechanism to send upcall events to the application.

Cache-Invalidation

Cache invalidation is one of the use-cases which makes use of this upcall infrastructure. It is currently required to support multi-head NFS-Ganesha. In the upcall xlator, for each file accessed, we invoke "upcall_cache_invalidate" in the cbk path, wherein we create/update the corresponding 'upcall_client_entry' stored in the upcall_inode_ctx. We then traverse the rest of the clients which recently accessed this file (stored in the same inode context) and issue a notify call with the upcall event type "CACHE_INVALIDATION" to the parent xlator, so that upcall notifications are sent to them with the below info:

* client_uid (to which upcall notification has to be sent)

* flags (which denote the attributes changed on the file).

Note: this upcall notification is sent only to those clients which have accessed the file recently (i.e., within CACHE_INVALIDATE_PERIOD, default 60 sec). This option will be made tunable in the future.
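
To make the flow above concrete, here is illustrative pseudocode of what upcall_cache_invalidate conceptually does (written in Python purely for readability; the actual implementation is C code inside the upcall xlator, and the names below mirror the description above rather than the exact source):

import time

CACHE_INVALIDATE_PERIOD = 60   # seconds; to be made tunable later

class UpcallClient(object):
    """Per-client entry kept in the inode context (upcall_client_t)."""
    def __init__(self, client_uid):
        self.client_uid = client_uid
        self.access_time = time.time()

class UpcallInodeCtx(object):
    """State kept per inode by the upcall xlator (upcall_inode_ctx)."""
    def __init__(self):
        self.clients = {}   # client_uid -> UpcallClient; mutex-protected in C

def upcall_cache_invalidate(inode_ctx, current_client_uid, flags, notify):
    """Invoked in the cbk path of each fop on the inode.

    'flags' denotes the attributes changed on the file; 'notify' raises a
    GF_EVENT_UPCALL / CACHE_INVALIDATION event towards the parent xlator.
    """
    now = time.time()

    # create/update the entry for the client that just performed the fop
    entry = inode_ctx.clients.setdefault(current_client_uid,
                                         UpcallClient(current_client_uid))
    entry.access_time = now

    # notify every *other* client that accessed the file recently
    for uid, client in inode_ctx.clients.items():
        if uid == current_client_uid:
            continue
        if now - client.access_time <= CACHE_INVALIDATE_PERIOD:
            notify("CACHE_INVALIDATION", client_uid=uid, flags=flags)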

A sequence diagram explaining this case can be found at the link below –

http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#Sequence_diagram

P.S. These changes are currently under review. I will update the post as and when any changes are made.
