B.6. Release 5.4 - changes from 5.3

B.6.1. New Features

B.6.2. Enhancements

OS: Based on CentOS release 5/update 5 and all updates as of November 2, 2010.

Base: Anaconda installer updated to v11.1.2.209.

Base: the private network is no longer remapped to "eth0". Instead, Rocks keeps track of the network a node kickstarted from and maps that network to the "private" network. For example, if a node kickstarted off "eth1", then "eth1" will be mapped to the private network.

Base: hardened the Anaconda installer to more aggressively write the grub configuration files onto the boot disk. This helps to mitigate the "hang while trying to load Grub stage2" issue.

Base: removed ext4 kernel module from installation environment. We found that trying to mount a swap partition as an ext4 file system frequently caused kernel panics during installations.

Base: added ksdevice=bootif to all the PXE boot targets. This improves installation speed by reusing the IP address/interface information when a node PXE boots. Previously, a node would re-scan all ethernet interfaces.
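As an illustration of the interface information being reused: PXELINUX (with IPAPPEND 2) passes the boot interface to the kernel as a BOOTIF=01-aa-bb-cc-dd-ee-ff argument, and ksdevice=bootif tells the installer to use that interface. The sketch below is not the installer's code. The function name is hypothetical, and it only shows how such a value maps to a MAC address.

```python
def bootif_to_mac(bootif):
    """Convert a PXELINUX BOOTIF value to a colon-separated MAC address.

    BOOTIF has the form "<hw-type>-<mac with dashes>", where the
    hardware type is "01" for Ethernet.
    """
    parts = bootif.strip().lower().split("-")
    if len(parts) != 7 or parts[0] != "01":
        raise ValueError("unexpected BOOTIF value: %r" % bootif)
    return ":".join(parts[1:])


print(bootif_to_mac("01-00-1A-2B-3C-4D-5E"))  # 00:1a:2b:3c:4d:5e
```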

Base: when a node XML file has a syntax error, "rocks list host profile" prints out the name of the node XML file and the line number where the syntax error occurred.

Base: "rocks run host" now spawns multiple parallel threads when multiple hosts are supplied. Also added the following parameters: timeout (thanks Tim Carlson!), delay, stats, collate and num-threads.
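The threaded fan-out can be pictured with a minimal sketch. This is not the "rocks run host" implementation: run_on_hosts and fake_uptime are hypothetical names, the remote command is replaced by a local stand-in, and the sketch assumes that collating means prefixing each output line with the host it came from. The timeout, delay and stats parameters are left out.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_hosts(hosts, run_one, num_threads=4, collate=False):
    """Run run_one(host) for every host using a pool of worker threads.

    run_one is a hypothetical callable that executes the command on one
    host and returns its output as a string.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in the same order as `hosts`.
        outputs = list(pool.map(run_one, hosts))

    lines = []
    for host, output in zip(hosts, outputs):
        for line in output.splitlines():
            # With collate, prefix each line with the host it came from.
            lines.append("%s: %s" % (host, line) if collate else line)
    return lines


def fake_uptime(host):
    # Stand-in for a real remote command.
    return "up on %s" % host


print(run_on_hosts(["compute-0-0", "compute-0-1"], fake_uptime, collate=True))
```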

Base: the default yum configuration now binds to the frontend's public IP address instead of its private one. This facilitates easy package installation for external nodes (e.g., nodes running on a public cloud).

Base: non-existent attributes are considered to be false conditionals when building configuration files.
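The rule can be sketched as follows. This is an illustration only, not the Rocks code: attr_is_true is a hypothetical helper, and the set of strings treated as "true" is an assumption.

```python
TRUE_STRINGS = ("true", "yes", "on", "1")  # assumed spellings of "true"


def attr_is_true(attrs, name):
    """Return True only if the attribute exists and holds a true value."""
    value = attrs.get(name)
    if value is None:
        # A missing attribute is a false conditional, not an error.
        return False
    return str(value).strip().lower() in TRUE_STRINGS


host_attrs = {"xen": "true", "sge": "false"}
print(attr_is_true(host_attrs, "xen"))        # True
print(attr_is_true(host_attrs, "sge"))        # False
print(attr_is_true(host_attrs, "exec_host"))  # False (attribute not set)
```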

Base: "precedes" method added for Rocks command plugins to enable fine-grained ordering of plugin execution.

Base: network interfaces under Linux support two new modes: "dhcp" and "noreport". The "dhcp" mode indicates that the interface should always DHCP to get its address. The "noreport" mode specifies that no "ifcfg-*" file should be written for the interface. If a mode is not specified for an interface, then Rocks will create an "ifcfg-*" file for the interface based on values set in the database (just like it did in the previous release).
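The three cases can be summarized in a short sketch. It is not the code Rocks uses to write "ifcfg-*" files: ifcfg_lines is a hypothetical function, and the exact set of keys written (DEVICE, ONBOOT, BOOTPROTO, IPADDR, NETMASK) is an assumption based on the standard ifcfg format.

```python
def ifcfg_lines(device, mode=None, ip=None, netmask=None):
    """Return the lines of an ifcfg-<device> file, or None for "noreport"."""
    if mode == "noreport":
        # No ifcfg-* file is written for this interface.
        return None
    lines = ["DEVICE=%s" % device, "ONBOOT=yes"]
    if mode == "dhcp":
        # The interface always uses DHCP to obtain its address.
        lines.append("BOOTPROTO=dhcp")
    else:
        # No mode given: use the values stored in the database.
        lines.append("BOOTPROTO=static")
        lines.append("IPADDR=%s" % ip)
        lines.append("NETMASK=%s" % netmask)
    return lines


print(ifcfg_lines("eth1", mode="dhcp"))
print(ifcfg_lines("eth2", mode="noreport"))
print(ifcfg_lines("eth0", ip="10.1.255.254", netmask="255.255.0.0"))
```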

Base: IPMI now uses the interface channel column in the networks table to specify the baseboard controller channel number.

Base: text inside "changelog" tags is now wrapped in CDATA to allow XML escape characters. This is only supported for node XML files found within Rolls (not for node XML files found under /export/rocks/install/site-profiles).

Base: rolls can be built without a complete copy of the Rocks source code. They use the Rocks development environment found under /opt/rocks/share/devel on a frontend.

Area51: tripwire updated to v2.4.2.

Bio: refreshed CPAN modules.

Bio: refreshed CPAN MPI-Blast.

Bio: added Celera Whole Genome Sequence Assembler.

Condor: updated to v7.4.4.

Condor: automated Condor configuration completely retooled: 1) the configuration is Rocks command based instead of using the standalone CondorConf tool, 2) it supports dynamic update of any/all configurations on nodes, 3) it uses Rocks command plugins to allow additional automated Condor configuration (e.g., via a plugin, it can turn on MPI support).

Condor: supports a pool password (shared secret) for additional host verification.

Condor: integrates with EC2 roll to extend Condor pools with EC2 Hosts.

Condor: support added for port ranges to facilitate firewall configuration.

Condor: local copy of Condor's manpages added to roll documents.

Condor: support for updating Condor on nodes without re-installation (e.g., rocks run host "yum update condor" ; rocks sync host condor).

Ganglia: monitor-core updated to v3.1.7.

Ganglia: rrdtool updated to v1.4.4.

Ganglia: the Ganglia Roll can now be added on-the-fly to an existing frontend.

Ganglia: all nodes send out their metric metadata every 3 minutes. In the past, when gmond was restarted on the frontend, it couldn't collect metrics from the nodes because it had no metadata from the nodes (and it didn't have a way to ask the nodes because the nodes are configured in "deaf" mode).

HPC: iozone updated to v3.347.

HPC: iperf updated to v2.0.5.

HPC: MPICH2 updated to v1.2.1p1.

HPC: OpenMPI updated to v1.4.3.

HPC: rocks-openmpi is the default MPI and it is configured with mpi-selector.

SGE: SGE updated to V62u5.

SGE: any host can be configured to be an execution host by setting the host's "exec_host" and "sge" attributes to true. Likewise, any host can become a submission host by setting the host's "submit_host" and "sge" attributes to true.

Web-server: mediawiki updated to v1.16.0.

Web-server: wordpress updated to v3.0.1.

Xen: any node can now host Xen virtual machines. This is controlled with the "xen" attribute.

Xen: set the power for all nodes in a virtual cluster (except the VM frontend) with one command ("rocks set cluster power ..."). Power settings can be "on", "off" or "install" (turn on and force installation).

Xen: allow virtual machines to define VLAN tagged interfaces. Previously, VLAN tagging was only supported for physical interfaces.

B.6.3. Bug Fixes

Base: non-root users can no longer see the encrypted passwords with 'rocks list host attr'. Hashed passwords are now stored in a 'shadow' column in the attribute tables.

Base: the "%" in "rocks run host %" now returns all hosts. Thanks to Tom Rockwell for the fix.

Base: if an ethernet switch sends out a DHCP request, the DHCP server no longer sends it the "filename" and "next server" in the DHCP response. Sending these fields caused some switches not to properly load their firmware. More generally, this is controlled by the "kickstartable", "dhcp_filename" and "dhcp_nextserver" attributes.

Base: "rocks set password" asks the user to confirm their new password.

Base: when a node requests a kickstart file and the frontend determines that it is too "busy", the kickstarting node now correctly does a random backoff before re-requesting its kickstart file. Prior to this fix, a node would back off for 30 seconds.
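A random backoff spreads the retries of many nodes over time instead of having them all return at once. The sketch below only illustrates the idea: backoff_seconds is a hypothetical function, and the 5 to 30 second window is an assumption, not the range Rocks uses.

```python
import random


def backoff_seconds(rng=random, low=5.0, high=30.0):
    """Pick a random wait before re-requesting the kickstart file.

    The 5-30 second window is an assumption for illustration only.
    """
    return rng.uniform(low, high)


# Each node draws its own delay, so retries do not arrive in lockstep.
delays = [backoff_seconds(random.Random(seed)) for seed in range(3)]
print(all(5.0 <= d <= 30.0 for d in delays))  # True
```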

Base: multiple conditionals can now be present in XML tags.

Base: fixed a graph traversal issue. In the past, if you had the graph "a" (cond) to "b" to "c" and if "cond" was false, the graph traversal would include "a" and "c". Now it just includes "a".
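The corrected behavior can be reproduced with a small sketch. It is not the Rocks graph code: traverse is a hypothetical function, and the graph is modeled as a dictionary mapping each node to its outgoing (child, cond) edges. An edge whose condition is false is not followed, so nothing behind it is reached.

```python
def traverse(edges, start):
    """Return nodes reachable from start, following only true edges.

    edges maps a node name to a list of (child, cond) pairs.
    """
    visited = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.append(node)
        for child, cond in edges.get(node, []):
            if cond:
                stack.append(child)
    return visited


# "a" (cond) to "b" to "c", with cond false.
graph = {"a": [("b", False)], "b": [("c", True)]}
print(traverse(graph, "a"))  # ['a']
```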

Base: permissions set in the "file" tag are preserved even if there are other "file" tags for the same file that don't set the file's permissions. Previously, when a later "file" tag without a "perms" attribute was encountered, the file's permissions were cleared.
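The fixed behavior amounts to only updating the permissions when a tag actually sets them. A minimal sketch, assuming each "file" tag is represented as a dictionary of its attributes (merge_file_tags is a hypothetical name, not a Rocks function):

```python
def merge_file_tags(tags):
    """Combine "file" tags for one path and return the final permissions.

    Each tag is a dict of attributes; "perms" is optional.
    """
    perms = None
    for tag in tags:
        if "perms" in tag:
            perms = tag["perms"]
        # A tag without "perms" leaves the earlier value untouched.
    return perms


tags = [{"name": "/etc/motd", "perms": "0644"}, {"name": "/etc/motd"}]
print(merge_file_tags(tags))  # 0644
```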

Base: "file" tags now support "os" conditionals.

Base: in insert-ethers, appliances that are marked "not kickstartable" no longer have to wait for a kickstart file. In the past, one had to hit the "F9" (force quit) key to exit insert-ethers when discovering non-kickstartable appliances (e.g., ethernet switches).

Base: IPMI configuration cleaned up. Rocks no longer generates erroneous entries in modprobe.conf or /etc/sysconfig/ifcfg-ipmi.

Base: The "pre" tag now supports the "interpreter=" attribute.

Bio: eliminated "Permission Denied" errors during multiple runs on the same BLAST database by different users.

SGE: made the job collection metric more efficient. Previously, when hundreds of jobs were submitted to a frontend's queue, the SGE metric took so long to execute that it caused gmond to stop gathering metrics for all hosts.

SGE: the number of CPUs that array jobs consume is now correctly counted.