2017-12-25

Monitoring Satellite 5 with PCP (Performance Co-Pilot)

During some performance testing we have done, I used PCP to monitor basic stats about Red Hat Satellite 5 (this could be applied to Spacewalk as well). I was unable to get it into a fully satisfying shape, but maybe somebody can fix and enhance it. I have taken lots of this from lzap. First of all, install PCP (the PostgreSQL and Apache PMDAs live in the RHEL Optional repo as of now; in CentOS 7 they seem to be directly in the base repo):
subscription-manager repos --enable rhel-6-server-optional-rpms
yum -y install pcp pcp-pmda-postgresql pcp-pmda-apache
subscription-manager repos --disable rhel-6-server-optional-rpms
Now start services:
chkconfig pmcd on
chkconfig pmlogger on
service pmcd restart
service pmlogger restart
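To quickly check that pmcd is up and answering, any metric will do, e.g.:
pminfo -f kernel.all.load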
Install the PostgreSQL and Apache monitoring plugins:
cd /var/lib/pcp/pmdas/postgresql
./Install   # select "c(ollector)" when it asks
cd /var/lib/pcp/pmdas/apache
echo -e "<Location /server-status>\n  SetHandler server-status\n  Allow from all\n</Location>\nExtendedStatus On" >>/etc/httpd/conf/httpd.conf
service httpd restart
./Install
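To verify the Apache PMDA can actually scrape the scoreboard (assuming httpd listens on localhost):
curl -s 'http://localhost/server-status?auto' | head -n 3
pminfo -f apache.busy_servers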
# Configure hot proc
cat >/var/lib/pcp/pmdas/proc/hotproc.conf <<EOF
> #pmdahotproc
> Version 1.0
> fname == "java" || fname == "httpd"
> EOF
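Restart pmcd so the proc PMDA re-reads the hotproc config, and check that the metrics show up:
service pmcd restart
pminfo -f hotproc.memory.rss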
And because I have a Graphite/Grafana setup available, I was pumping selected metrics there (from RHEL 6, which uses SysV init):
# tail -n 1 /etc/rc.local
pcp2graphite --graphite-host carbon.example.com --prefix "pcp-jhutar." --host localhost - kernel.all.load mem.util.used mem.util.swapCached filesys.full network.interface.out.bytes network.interface.in.bytes disk.dm.read disk.dm.write apache.requests_per_sec apache.bytes_per_sec apache.busy_servers apache.idle_servers postgresql.stat.all_tables.idx_scan postgresql.stat.all_tables.seq_scan postgresql.stat.database.tup_inserted postgresql.stat.database.tup_returned postgresql.stat.database.tup_deleted postgresql.stat.database.tup_fetched postgresql.stat.database.tup_updated filesys.full hotproc.memory.rss &

Problems I had with this

For some reason I have not investigated closely, after some time the PostgreSQL data were no longer visible in Grafana. I was also unable to get the hotproc data into Grafana. I also experimented with PCP's emulation of Graphite and its bundled Grafana, but PCP's Graphite lacks filters, which makes it hard to use and impractical for anything beyond simple stats.

2017-12-22

"Error: Too many open files" when inside Docker container

Does not work: various ulimit settings for daemon

We have a container built from this Dockerfile, running on RHEL 7 with an oldish docker-1.10.3-59.el7.x86_64. Containers are started with:

# for i in $( seq 500 ); do
      docker run -h "$( hostname -s )container$i.example.com" -d --tmpfs /tmp --tmpfs /run -v /sys/fs/cgroup:/sys/fs/cgroup:ro --ulimit nofile=10000:10000 r7perfsat
  done

and we have set limits for the docker service on the docker host:

# cat /etc/systemd/system/docker.service.d/limits.conf
[Service]
LimitNOFILE=10485760
LimitNPROC=10485760
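
Note these only take effect after the usual systemd drop-in dance:

# systemctl daemon-reload
# systemctl restart docker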

but we still saw issues with "Too many open files" inside the containers. It could happen when installing a package with yum (resulting in a corrupted RPM database; rm -rf /var/lib/rpm/__db.00* and rpm --rebuilddb saved it, though) and when enabling a service (our containers have systemd in them on purpose):

# systemctl restart osad
Error: Too many open files
# echo $?
0

Because I was stupid, I did not check the journal (in the container) at the moment I first spotted the failure:

Dec 21 10:18:54 b08-h19-r620container247.example.com journalctl[39]: Failed to create inotify watch: Too many open files
Dec 21 10:18:54 b08-h19-r620container247.example.com systemd[1]: systemd-journal-flush.service: main process exited, code=exited, status=1/FAILURE
Dec 21 10:18:54 b08-h19-r620container247.example.com systemd[1]: inotify_init1() failed: Too many open files
Dec 21 10:18:54 b08-h19-r620container247.example.com systemd[1]: inotify_init1() failed: Too many open files

Does work: fs.inotify.max_user_instances

Eventually I ran into some issue, and the very last comment there had a thing I had not seen before. I ended up with:

# cat /etc/sysctl.d/40-max-user-watches.conf
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
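
Apply it without a reboot (on RHEL 7, sysctl --system re-reads all of /etc/sysctl.d):

# sysctl --system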

The defaults on a different machine are:

# sysctl -a 2>&1 | grep fs.inotify.max_user_
fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 8192

Looks like increasing fs.inotify.max_user_instances helped and our containers are stable.

2017-11-04

Working local DNS for your libvirtd guests

Update 2017-12-25: a possibly better way: Definitive solution to libvirt guest naming

This is basically just a copy & paste of commands from these two great posts: [Howto] Automated DNS resolution for KVM/libvirt guests with a local domain and Automatic DNS updates from libvirt guests, which already saved me a lot of typing. So, with my favorite domain:

Make libvirtd's dnsmasq act as the authoritative nameserver for the example.com domain (note the <domain name='example.com' localOnly='yes'/> element below):

# virsh net-dumpxml default
<network>
  <name>default</name>
  <uuid>2ed15952-d1c0-4819-bde5-c8f7278ce3ac</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:a4:40:a7'/>
  <domain name='example.com' localOnly='yes'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>

And restart that network:

# virsh net-edit default   # do the edits here
# virsh net-destroy default
# virsh net-start default

Now configure NetworkManager to start its own dnsmasq, which acts as your local caching nameserver and forwards all requests for the example.com domain to the 192.168.122.1 nameserver (which is libvirtd's dnsmasq):

# cat /etc/NetworkManager/conf.d/localdns.conf
[main]
dns=dnsmasq
# cat /etc/NetworkManager/dnsmasq.d/libvirt_dnsmasq.conf
server=/example.com/192.168.122.1

And restart NetworkManager:

# systemctl restart NetworkManager

Now if I have a guest with its hostname set to "satellite.example.com" (check HOSTNAME=... in /etc/sysconfig/network on RHEL 6 and below, or use hostnamectl set-hostname ... on RHEL 7), I can ping it by hostname from both the virtualization host and other guests on that host. If you have some old OS release on the guest (like RHEL 6.5 from what I have tried; 6.8 does not need this), set the hostname with DHCP_HOSTNAME=... in /etc/sysconfig/network-scripts/ifcfg-eth0 (on the guest) to make this work.
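
To verify the whole chain (satellite.example.com being just my example guest), query both dnsmasq instances directly:

# dig +short satellite.example.com @192.168.122.1
# dig +short satellite.example.com @127.0.0.1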

2017-08-13

Quick Python performance tuning cheat-sheet

Just a few commands without any context:

Profiling with cProfile

This helped me find the slowest functions, because when optimizing I need to focus on these (best ratio of work needed vs. benefit). It helped me find a function which did some unnecessary calculations over and over again:

$ python -m cProfile -o cProfile-first_try.out ./layout-generate.py ...
$ python -m pstats cProfile-first_try.out 
Welcome to the profile statistics browser.
cProfile-first_try.out% sort
Valid sort keys (unique prefixes are accepted):
cumulative -- cumulative time
module -- file name
ncalls -- call count
pcalls -- primitive call count
file -- file name
line -- line number
name -- function name
calls -- call count
stdname -- standard name
nfl -- name/file/line
filename -- file name
cumtime -- cumulative time
time -- internal time
tottime -- internal time
cProfile-first_try.out% sort tottime
cProfile-first_try.out% stats 10
Sat Aug 12 23:19:40 2017    cProfile-first_try.out

         18508294 function calls (18501563 primitive calls) in 8.369 seconds

   Ordered by: internal time
   List reduced from 2447 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    27837    4.230    0.000    5.015    0.000 ./utils_matrix2layout.py:14(get_distance_matrix_2d)
    10002    1.356    0.000    1.513    0.000 ./utils_matrix2layout.py:244(get_measured_error_2d)
  5674796    0.572    0.000    0.572    0.000 /usr/lib64/python2.7/collections.py:90(__iter__)
  5340664    0.219    0.000    0.219    0.000 {math.sqrt}
  5432768    0.189    0.000    0.189    0.000 {abs}
   230401    0.183    0.000    0.183    0.000 /usr/lib64/python2.7/collections.py:71(__setitem__)
        1    0.178    0.178    0.282    0.282 ./utils_matrix2layout.py:543(count_angles_layout)
    10018    0.119    0.000    0.345    0.000 /usr/lib64/python2.7/_abcoll.py:548(update)
        1    0.102    0.102    6.749    6.749 ./utils_matrix2layout.py:393(iterate_evolution)
     1142    0.092    0.000    0.111    0.000 /usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py:1299(svd)

To explain the columns, the Instant User's Manual says:

tottime
the total time spent in the given function (excluding time spent in calls to sub-functions)
cumtime
the cumulative time spent in this and all subfunctions (from invocation till exit). This figure is accurate even for recursive functions.

Let's compile to C with Cython

Simply doing this on the module which does most of the work gave me about a 20% speedup:

# dnf install python2-Cython
$ cython utils_matrix2layout.py
$ gcc `python2-config --cflags --ldflags` -shared utils_matrix2layout.c -o utils_matrix2layout.so
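
Python prefers the compiled .so over the .py on import, so nothing else needs to change; a quick check of which file actually gets loaded (assuming the module sits in the current directory):

$ python2 -c "import utils_matrix2layout; print(utils_matrix2layout.__file__)"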

There is much more to do to optimize it, but that would need additional work, so not now :-) Some helpful links:

2017-06-03

Hard times with Ansible's to_datetime filter

I was a bit stupid. It took me some time to figure out how this is supposed to work, so here it is.

In Ansible 2.2 there is a new "to_datetime" filter (see the bottom of that section) which transforms a datetime string into a datetime object.

Basic usage to convert a string to a datetime object; not that useful in this form:

$ ansible -c local -m debug -a "var=\"'2017-06-01 20:30:40'|to_datetime\"" localhost

localhost | SUCCESS => {
    "'2017-06-01 20:30:40'|to_datetime": "2017-01-06 20:30:40", 
    "changed": false
}

You can parse a datetime string with an arbitrary format (see the Python documentation for formatting options):

$ ansible -c local -m debug -a "var=\"'06/01/2017'|to_datetime('%m/%d/%Y')\"" localhost

localhost | SUCCESS => {
    "'06/01/2017'|to_datetime('%m/%d/%Y')": "2017-06-01 00:00:00", 
    "changed": false
}

In my case I wanted to parse the start and end dates of some registered task in an Ansible playbook (so in the playbook the string to parse would be registered_variable.start). Maybe you do not want a datetime object, but a UNIX timestamp (notice the extra parentheses):

$ ansible -c local -m debug -a "var=\"('2017-06-01 20:30:40.123456'|to_datetime('%Y-%m-%d %H:%M:%S.%f')).strftime('%s')\"" localhost

localhost | SUCCESS => {
    "('2017-06-01 20:30:40.123456'|to_datetime('%Y-%m-%d %H:%M:%S.%f')).strftime('%s')": "1496341840", 
    "changed": false
}

But actually I just wanted to know how much time a given task took, so I can simply subtract two datetime objects and then use .seconds of the resulting timedelta object:

$ ansible -c local -m debug -a "var=\"( '2017-06-01 20:30:40.123456'|to_datetime('%Y-%m-%d %H:%M:%S.%f') - '2017-06-01 20:29:35.234567'|to_datetime('%Y-%m-%d %H:%M:%S.%f') ).seconds\"" localhost

localhost | SUCCESS => {
    "( '2017-06-01 20:30:40.123456'|to_datetime('%Y-%m-%d %H:%M:%S.%f') - '2017-06-01 20:29:35.234567'|to_datetime('%Y-%m-%d %H:%M:%S.%f') ).seconds": "64", 
    "changed": false
}
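
One caveat: .seconds holds only the within-day remainder of the timedelta (it wraps at 24 hours), so for durations which could exceed a day, .total_seconds() is safer (it returns a float):

$ ansible -c local -m debug -a "var=\"( '2017-06-01 20:30:40.123456'|to_datetime('%Y-%m-%d %H:%M:%S.%f') - '2017-06-01 20:29:35.234567'|to_datetime('%Y-%m-%d %H:%M:%S.%f') ).total_seconds()\"" localhost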

In pre-2.2 versions, you can use this inefficient call of the local date command (you do not have to worry about that ugly '\\\"' escaping when in a playbook):

$ ansible -c local -m debug -a "var=\"lookup('pipe', 'date -d \\\"2017-06-01 20:30:40.123456\\\" +%s')\"" localhost

localhost | SUCCESS => {
    "changed": false, 
    "lookup('pipe', 'date -d \"2017-06-01 20:30:40.123456\" +%s')": "1496341840"
}
Good luck!

2017-02-02

DNS and "next-server" in DHCP configuration on libvirt's dnsmasq

I was playing with Satellite and re-provisioning a client registered to it. This is awkward when you do it remotely and on real hardware: for me it is difficult to set up (if you want DNS and DHCP), and when the client fails during re-provisioning, you either have to have physical access to it, or the client has to have some kind of remote management console. Using libvirt is, on the other hand, very straightforward, and you get DNS and DHCP for free.

# virsh net-edit --network default
<network>
  <name>default</name>
  <uuid>970b7e2e-88d1-4100-8a2a-8db36c911d4c</uuid>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:f1:e9:9a'/>
  <dns>
    <host ip='192.168.122.46'>
      <hostname>sat-emb.example.com</hostname>
    </host>
    <host ip='192.168.122.170'>
      <hostname>proxy.example.com</hostname>
    </host>
    <host ip='192.168.122.25'>
      <hostname>client.example.com</hostname>
    </host>
  </dns>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
      <bootp file='/pxelinux.0' server='192.168.122.46'/>
    </dhcp>
  </ip>
</network>

<network><dns> configures hostnames and their IPs for domain name resolution.

<network><ip><dhcp><bootp> allows me to set the server which acts as the PXE network boot server and the file clients should request. In my case, 192.168.122.46 is a Satellite with tftp running and configured.
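
As with the network edits above, changes made via net-edit only take effect after the network is restarted:

# virsh net-destroy default
# virsh net-start default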

NOTE: I noticed that guests could not translate outside-world hostnames to IPs - it looked like dnsmasq on the virtualization host was not forwarding requests it could not resolve to the DNS servers from /etc/resolv.conf. Adding "<dns><forwarder addr="ip.of.another.nameserver" domain="internal.network.com"/>..." and restarting the network did not help. In the end I discovered there was a forgotten no-resolv option in /var/lib/libvirt/dnsmasq/default.conf. When I removed it and restarted the network to regenerate the config, it worked. I had probably left it there during some previous adventures. From the dnsmasq manual page:

       -R, --no-resolv
              Don't read /etc/resolv.conf. Get upstream servers
              only from the command line or the dnsmasq
              configuration file.

2017-01-26

Docker 1.10 with Devicemapper direct LVM vs. OverlayFS

During some load testing work we had big problems with excessive disk consumption on Devicemapper direct LVM - our use case of very many small (CPU/RAM-wise) containers is not a good fit for this storage driver. We have experimented and compared Docker 1.10 (docker-1.10.3-59.el7.x86_64) with Devicemapper direct LVM vs. OverlayFS (with XFS as the "hosting" FS), and there are some random numbers in this post. Docker also has a page comparing different storage drivers. Red Hat documentation warns about OverlayFS stability, so we will see how it goes later under real load.

Configuring Docker for Devicemapper direct LVM

You need an LVM volume group with free space for this:
systemctl stop docker
:>/etc/sysconfig/docker-storage
rm -rf /var/lib/docker/*
echo "VG='volume_group_with_free_space_for_docker'" >/etc/sysconfig/docker-storage-setup
docker-storage-setup
systemctl start docker

Configuring Docker for OverlayFS

Docker's documentation advises using a separate partition for OverlayFS, as this storage driver consumes lots of inodes ("overlay2" should be better in that regard, but afaict it only comes with Docker 1.12). We will see how it goes, as we did not do any tuning when creating the filesystem there. Based on How to create XFS filesystem for OverlayFS and Changing Storage Configuration and "Overlay Graph Driver" below:
systemctl stop docker
sed -i '/OPTIONS=/s/--selinux-enabled//' /etc/sysconfig/docker
:>/etc/sysconfig/docker-storage
rm -rf /var/lib/docker/*
lvcreate --name docker --extents 100%FREE volume_group_with_free_space_for_docker
mkfs -t xfs -n ftype=1 /dev/volume_group_with_free_space_for_docker/docker
echo "/dev/volume_group_with_free_space_for_docker/docker /var/lib/docker xfs defaults 0 0" >>/etc/fstab
mount /var/lib/docker
echo "STORAGE_DRIVER='overlay'" >/etc/sysconfig/docker-storage-setup
docker-storage-setup
systemctl start docker
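
After either setup, you can double-check which driver is active:

# docker info | grep -A 1 'Storage Driver'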

Comparing Devicemapper direct LVM vs. OverlayFS

This is the container we are using.

Starting containers

# time for i in $(seq 10); do docker run -h "con$i.example.com" -d r7perfsat; done
Devicemapper direct LVM: 38.832s
OverlayFS: 7.530s

Inspecting container sizes

# time docker inspect --size -f "SizeRw={{.SizeRw}} SizeRootFs={{.SizeRootFs}}" some_container
Devicemapper direct LVM: 18.254s
  SizeRw=2917 SizeRootFs=1633247414
OverlayFS: 2.694s
  SizeRw=2921 SizeRootFs=1633247406

Note: not sure whether the above is a relevant test case.

Stopping containers

# time docker stop 10_containers_we_have_created_before
Devicemapper direct LVM: 4.888s
OverlayFS: 3.266s

Removing stopped containers

# time docker rm 10_containers_we_have_created_before
Devicemapper direct LVM: 2.289s
OverlayFS: 0.206s

Registering containers

Running an Ansible playbook which registers via subscription-manager to Satellite 6, installs the Katello Agent and does a few quick changes to the container, on 5 of these containers in parallel:

Devicemapper direct LVM: 2m54.169s
OverlayFS: 2m45.890s

Note that during this test the load on the Docker host seemed much smaller on the OverlayFS-enabled host, but that might be because of all the other containers running on the host.

Downgrading a few packages in containers

Running an Ansible playbook which downgrades a few of the smallest packages, on 5 of these containers in parallel:

Devicemapper direct LVM: 1m34.035s
OverlayFS: 1m1.685s

At first glance, OverlayFS seems faster.

2017-01-06

Changing (growing) xfs partition size which is not on the LVM

XFS is the default file system type in Red Hat Enterprise Linux 7. Growing a file system that sits on LVM is quite familiar to me, but when it is not on LVM, I was very unsure. Anyway, this worked:

I want to increase the root partition size, and because the host is actually a virtual machine, it is easy to add more space. So in the VM we are using about 50 GB out of 161 GB of available disk space. When using fdisk, the guide advises switching it to "sectors" mode. That seems to be the default, but the option is there just to be sure. Note the start sector (5222400 in my case) of the /dev/vda3 partition which hosts the root file system:

[root@rhevm ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda3        48G   36G   13G  75% /
devtmpfs         11G     0   11G   0% /dev
tmpfs            11G     0   11G   0% /dev/shm
tmpfs            11G  8.4M   11G   1% /run
tmpfs            11G     0   11G   0% /sys/fs/cgroup
/dev/vda1       497M  208M  289M  42% /boot
tmpfs           2.1G     0  2.1G   0% /run/user/0
[root@rhevm ~]# fdisk -u=sectors -l /dev/vda 

Disk /dev/vda: 161.1 GB, 161061273600 bytes, 314572800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000e38b0

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *        2048     1026047      512000   83  Linux
/dev/vda2         1026048     5222399     2098176   82  Linux swap / Solaris
/dev/vda3         5222400   104857599    49817600   83  Linux

If we try to extend the XFS file system now, it won't work, because there is no space for the expansion (ignore the actual numbers, as I captured them after the expansion shown below):

[root@rhevm ~]# xfs_info /
meta-data=/dev/vda3              isize=256    agcount=9, agsize=3113600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=26214400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6081, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@rhevm ~]# xfs_growfs -D 30000000 /
meta-data=/dev/vda3              isize=256    agcount=9, agsize=3113600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=26214400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6081, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data size 30000000 too large, maximum is 26214400

So, let's delete the partition and create it again at the required size (I want it to be 100 GB):

[root@rhevm ~]# fdisk -u=sectors /dev/vda 
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): m   
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   g   create a new empty GPT partition table
   G   create an IRIX (SGI) partition table
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)

Command (m for help): p

Disk /dev/vda: 161.1 GB, 161061273600 bytes, 314572800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000e38b0

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *        2048     1026047      512000   83  Linux
/dev/vda2         1026048     5222399     2098176   82  Linux swap / Solaris
/dev/vda3         5222400   104857599    49817600   83  Linux

Command (m for help): d
Partition number (1-3, default 3): 3
Partition 3 is deleted

Command (m for help): n
Partition type:
   p   primary (2 primary, 0 extended, 2 free)
   e   extended
Select (default p): p
Partition number (3,4, default 3): 3
First sector (5222400-314572799, default 5222400): 5222400
Last sector, +sectors or +size{K,M,G} (5222400-314572799, default 314572799): +100G
Partition 3 of type Linux and of size 100 GiB is set

Command (m for help): p

Disk /dev/vda: 161.1 GB, 161061273600 bytes, 314572800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000e38b0

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *        2048     1026047      512000   83  Linux
/dev/vda2         1026048     5222399     2098176   82  Linux swap / Solaris
/dev/vda3         5222400   214937599   104857600   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
[root@rhevm ~]# partprobe 
Error: Partition(s) 3 on /dev/vda have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use.  As a result, the old partition(s) will remain in use.  You should reboot now before making further changes.
[root@rhevm ~]# shutdown -r now

After reboot, we are finally ready to grow the file system:

[root@rhevm ~]# xfs_growfs /
meta-data=/dev/vda3              isize=256    agcount=4, agsize=3113600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=12454400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6081, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 12454400 to 26214400
[root@rhevm ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda3       100G   36G   65G  36% /
devtmpfs         11G     0   11G   0% /dev
tmpfs            11G     0   11G   0% /dev/shm
tmpfs            11G  8.4M   11G   1% /run
tmpfs            11G     0   11G   0% /sys/fs/cgroup
/dev/vda1       497M  208M  289M  42% /boot
tmpfs           2.1G     0  2.1G   0% /run/user/0

2017-01-02

Remember that in Bash each command in pipeline is executed in a subshell

This took me some time to debug recently, so I wanted to share. Nothing new at all, but it is good to be reminded of it from time to time :-)

$ function aaa() {
>   return 10 | true
> }
$ aaa
$ echo $?
0

The above can be a bit surprising (why the heck am I not getting exit code 10 when that return was executed?), especially when hidden in some bigger chunk of code, but with set -o pipefail I get what I wanted:

$ set -o pipefail
$ aaa
$ echo $?
10
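
Alternatively, Bash keeps the exit status of each command of the last pipeline in the PIPESTATUS array (it has to be read right away, before any other command overwrites it):

$ false | true
$ echo "${PIPESTATUS[@]}"
1 0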