FAQ

last modified 2008-07-04 13:16
  1. How to add clusters via the GUI (Web)?
  2. How to add clusters via the command line (script)?
  3. How to mount NFS?
  4. What services are turned off by default?
  5. What services are turned on by default?
  6. What data is backed up and restored?
  7. How to change a DHCP-assigned IP to a static IP?
  8. How to change the hostname?
  9. How to add the RTM host as an LSF client?
  10. Issue regarding set_rtc_mms
  11. Port needed by RTM
  12. "Diminished" status of cluster
  13. How to add all devices using grid_add_cluster.php?


How to add clusters via the GUI (Web)?

  • Navigate to the Main console tab.
  • Click Clusters in the left panel.
  • Click “Add” at the top right, just below the “Logout” link.
  • Fill in the empty form fields as described.
  • Choose the correct poller for the specific LSF cluster.
  • Click the “create” button at the bottom right of the page.

The newly added cluster will be shown in the cluster list. Remember to add the RTM host to the LSF cluster as an LSF client.

How to add clusters via the command line (script)?

Certain parameters must be included when using the script.

  • type determines what you are adding, i.e. a cluster or a device.
    --type=0 to add a new cluster
    --type=1 to add a new device

  • cluster_name is used to identify the cluster.
  • cluster_poller selects the poller for the specific version of LSF cluster.
    --cluster_poller=1 is used for LSF 6.2
    --cluster_poller=2 is used for LSF 7.01
    --cluster_poller=3 is used for LSF 7.02
    --cluster_poller=4 is used for LSF 7.03

  • cluster_lsf_envdir specifies the directory containing lsf.conf (ego.conf) for the cluster.
  • To add a cluster using the grid_add_cluster.php script, open a terminal and navigate to the following folder.
    # cd /opt/cacti/plugins/grid

  • After navigating to that folder, type
    # php grid_add_cluster.php --type=0 --cluster_name=testing --cluster_poller=1 --cluster_lsf_envdir=/opt/rtm/etc/test

  • Press “Enter” and the cluster will be added.
  • Verify by viewing it in the browser and make sure the status is up.
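
When several clusters need to be added, the steps above can be wrapped in a small script. The following is a minimal sketch only: the helper names poller_id_for_version and build_add_cluster_cmd are hypothetical, the version-to-poller mapping is copied from the list above, and the script prints the command rather than running it.

```shell
# Sketch: build the grid_add_cluster.php command line for a new cluster.
# poller_id_for_version and build_add_cluster_cmd are hypothetical helper names.

# Map an LSF version string to the --cluster_poller value listed in this FAQ.
poller_id_for_version() {
    case "$1" in
        6.2)  echo 1 ;;
        7.01) echo 2 ;;
        7.02) echo 3 ;;
        7.03) echo 4 ;;
        *)    echo "unknown LSF version: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command that would add the cluster.
build_add_cluster_cmd() {
    name=$1 version=$2 envdir=$3
    poller=$(poller_id_for_version "$version") || return 1
    echo "php grid_add_cluster.php --type=0 --cluster_name=$name --cluster_poller=$poller --cluster_lsf_envdir=$envdir"
}

# Same values as the example in this FAQ.
build_add_cluster_cmd testing 6.2 /opt/rtm/etc/test
```

Run the printed command from /opt/cacti/plugins/grid once it looks correct.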

How to mount NFS?

Unnecessary services are turned off by default. Therefore, to mount an NFS directory, a couple of services need to be turned on.

  • To start sharing an NFS directory, open a terminal and run the following.
    # service portmap start
    # service nfs start
  • To have these two services start at boot, first check the current status of portmap.
    # chkconfig --list portmap
  • Turn it on at boot.
    # chkconfig portmap on
  • Do the same for nfs.
    # chkconfig nfs on
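
The same commands can be collected into one script. This is a sketch under stated assumptions: DRY_RUN, run and enable_nfs_services are hypothetical names, and by default the script only prints the commands so that it can be reviewed before it is run as root.

```shell
# Sketch: start portmap and nfs now and enable them at boot.
# DRY_RUN is a hypothetical switch; set DRY_RUN=0 and run as root to apply.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"        # only show the command
    else
        "$@"             # actually execute it
    fi
}

enable_nfs_services() {
    # Same order as the steps in this FAQ: portmap first, then nfs.
    for svc in portmap nfs; do
        run service "$svc" start
        run chkconfig "$svc" on
    done
}

enable_nfs_services
```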

What services are turned off by default?

NetworkManager  			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
NetworkManagerDispatcher  		        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
acpid					        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
anacron         			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
apmd            			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
atd             			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
autofs          			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
avahi-daemon    			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
avahi-dnsconfd  			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
bluetooth      	 			0:off   1:off   2:off   3:off   4:off   5:off   6:off 
conman          			        0:off   1:off   2:off   3:off   4:off   5:off   6:off 
cpuspeed       				0:off	1:on	2:off	3:off	4:off	5:off	6:off 
cups           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
dc_client      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
dc_server      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
dhcdbd         				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
dund           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
firstboot      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
gpm            				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
haldaemon      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
hidd           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
ibmasm         				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
ip6tables      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
ipmi           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
iptables       				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
irda           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
irqbalance     				0:off	1:off	2:off	3:off	4:off	5:off	6:off  
kudzu          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
lm_sensors     				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
mcstrans       				0:off	1:off	2:off	3:off	4:off	5:off	6:off  
mdmonitor      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
mdmpd          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
messagebus     				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
microcode_ctl  				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
netfs          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
netplugd       				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
nfs            				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
nfslock        				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
nscd           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
oddjobd        				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
pand           				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
pcscd          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
portmap        				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
psacct         				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
rdisc          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
readahead_early				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
restorecond    				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
rpcgssd        				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
rpcidmapd      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
rpcsvcgssd     				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
saslauthd      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
sendmail       				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
smartd         				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
snmpd          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
snmptrapd      				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
squid          				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
tux            				0:off	1:off	2:off	3:off	4:off	5:off	6:off  
winbind        				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
wpa_supplicant 				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
ypbind         				0:off	1:off	2:off	3:off	4:off	5:off	6:off 
yum-updatesd   				0:off	1:off	2:off	3:off	4:off	5:off	6:off

What services are turned on by default?

advocate        			        0:off   1:off   2:on    3:on    4:on    5:on    6:off 
auditd          			        0:off   1:off   2:on    3:on    4:on    5:on    6:off 
crond          				0:off	1:off	2:on	3:on	4:on	5:on	6:off  
httpd          				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
lmgrd          				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
lsfpollerd     				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
mysqld         				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
network        				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
ntpd           				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
readahead_later				0:off	1:off	2:off	3:off	4:off	5:on	6:off
rtm            				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
sshd           				0:off	1:off	2:on	3:on	4:on	5:on	6:off 
syslog         				0:off	1:off	2:on	3:on	4:on	5:on	6:off
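
To compare a running RTM host against the lists above, the output of chkconfig --list can be filtered by runlevel. The following is a small sketch; services_on_at is a hypothetical helper name, and it assumes input lines in the same layout as the listings in this FAQ.

```shell
# Sketch: list services that are "on" at a given runlevel.
# Reads `chkconfig --list`-style lines on stdin; services_on_at is a hypothetical name.
services_on_at() {
    # $1 is the runlevel; print the service name when a field equals "<runlevel>:on".
    awk -v want="$1:on" '{ for (i = 2; i <= NF; i++) if ($i == want) print $1 }'
}

# Typical use on the RTM host (runlevel 3):
#   chkconfig --list | services_on_at 3
```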

What data is backed up and restored?

  • List of files that are backed up
    rtm.lic              License file
    lsfpollerd.conf      Database file which contains the credentials
    lsf.conf/ego.conf    The lsf.conf and ego.conf files that are associated with each added cluster.
    
  • All tables in the cacti database are backed up except the following
    grid_jobs
    grid_jobs_rusage
    grid_job_interval_stats
    poller_output
    poller_output_boost
    
  • List of files that are restored
    The lsf.conf/ego.conf files are regenerated rather than restored because of the hosts file implications.
    

Note that rtm.lic is not restored even though it is backed up.

How to change a DHCP-assigned IP to a static IP?

To use a static IP, do the following.

  • Open a terminal and type the following.
    # vim /etc/sysconfig/network-scripts/ifcfg-eth0
    
  • The following will be displayed.
    DEVICE=eth0
    BOOTPROTO=dhcp
    DHCPCLASS=
    HWADDR=xx:xx:xx:xx:xx:xx
    ONBOOT=yes
    DHCP_HOSTNAME=cacti
  • Comment out the following entries by adding a “#” in front of them (without the quotes).
    #BOOTPROTO
    #DHCPCLASS
    
  • Add the following with the appropriate values.
    IPADDR=xxx.xxx.xxx.xxx
    NETMASK=xxx.xxx.xxx.xxx
    DNS=xxx.xxx.xxx.xxx
    GATEWAY=xxx.xxx.xxx.xxx
    
  • Save and exit the vim editor.
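
The manual edit above can also be scripted. The sketch below is illustrative only: make_static is a hypothetical helper, the 192.0.2.x addresses are placeholders, and the entry names (IPADDR, NETMASK, DNS, GATEWAY) are taken as-is from the steps above. It is safest to run it on a copy of the file first.

```shell
# Sketch: convert an ifcfg file from DHCP to a static address.
# make_static is a hypothetical helper; the 192.0.2.x values below are placeholders.
make_static() {
    file=$1 ip=$2 mask=$3 dns=$4 gw=$5
    tmp="$file.tmp"
    # Comment out the DHCP-specific entries, keep everything else unchanged.
    sed -e 's/^BOOTPROTO=/#&/' -e 's/^DHCPCLASS=/#&/' "$file" > "$tmp" || return 1
    # Append the static settings named in the steps above.
    {
        echo "IPADDR=$ip"
        echo "NETMASK=$mask"
        echo "DNS=$dns"
        echo "GATEWAY=$gw"
    } >> "$tmp"
    mv "$tmp" "$file"
}

# Example (work on a copy first):
#   cp /etc/sysconfig/network-scripts/ifcfg-eth0 /tmp/ifcfg-eth0
#   make_static /tmp/ifcfg-eth0 192.0.2.10 255.255.255.0 192.0.2.1 192.0.2.1
```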

How to change the hostname?

The default hostname for RTM is cacti, but it can be changed.

  • Open a terminal and type the following.
    # vim /etc/sysconfig/network
    
  • The following will be displayed.
    NETWORKING=yes
    NETWORKING_IPV6=yes
    HOSTNAME=cacti
    
  • Change the hostname as required, save the file, and reboot.
  • The new hostname takes effect on startup.
  • The new hostname will also be updated in the hosts file.
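
For unattended setups, the HOSTNAME entry can be changed without opening an editor. This is a minimal sketch: set_hostname is a hypothetical helper and rtm01 is a placeholder name. A reboot is still needed afterwards, as described above.

```shell
# Sketch: change the HOSTNAME entry in a network config file.
# set_hostname is a hypothetical helper name; the new name must not contain "/" or "&".
set_hostname() {
    file=$1 new=$2
    tmp="$file.tmp"
    # Replace the whole HOSTNAME line and leave the other entries untouched.
    sed "s/^HOSTNAME=.*/HOSTNAME=$new/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (rtm01 is a placeholder):
#   set_hostname /etc/sysconfig/network rtm01
```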

How to add the RTM host as an LSF client?

After adding the LSF cluster to RTM, RTM must also be added to the LSF cluster as an LSF client.

  • Open a terminal on the LSF master and do the following.
    # vim /etc/hosts
    
  • Enter the IP address and hostname of the RTM host. Save and exit the vim editor, then open the cluster file.
    # vim //conf/lsf.cluster.
    
  • Add additional entries to the following section of the flat file.
    Begin   Host 
    HOSTNAME  	model    type        server r1m  mem  swp  RESOURCES    #Keywords 
     	!	  !	     0	    3.5   ()        ()      ()            	
    End     Host 
    
  • Save and exit the vim editor.
  • Restart the LSF cluster.
  • To verify that the client has been added successfully, open a terminal on the RTM host machine and type the following.
    # telnet  6879/7869 (6.2 cluster/7 cluster)
    
  • If everything is done properly, you should see the following.
    Trying <ip_add_of_lsf_cluster>... 
    Connected to <hostname_of_cluster> (<ip_add_of_lsf_cluster>). 
    Escape character is '^]'.
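
The choice of port in the telnet check depends on the cluster version. The sketch below picks the port and prints the telnet command; telnet_port_for_version and telnet_check_cmd are hypothetical helper names, lsfmaster.example.com is a placeholder hostname, and the two port numbers are the ones given in the step above.

```shell
# Sketch: choose the port for the telnet check by cluster version.
# telnet_port_for_version and telnet_check_cmd are hypothetical helper names.
telnet_port_for_version() {
    case "$1" in
        6.2)    echo 6879 ;;   # 6.2 cluster
        7|7.*)  echo 7869 ;;   # 7 cluster
        *)      echo "unknown LSF version: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the telnet command for a given LSF master host.
telnet_check_cmd() {
    host=$1 version=$2
    port=$(telnet_port_for_version "$version") || return 1
    echo "telnet $host $port"
}

# lsfmaster.example.com is a placeholder hostname.
telnet_check_cmd lsfmaster.example.com 7.02
```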
     

Issue regarding set_rtc_mms

This issue affects users running the ISO on VMware. VMware does not give the guest OS a steady clock, so it does not work well with NTP. When NTP sees that the system is more than 10 seconds out of sync, it assumes something has gone wrong and terminates quietly. As a result, set_rtc_mms messages appear on the console regularly.

Port needed by RTM

Currently RTM does not need to know the MBD (Master Batch Daemon), SBD (Slave Batch Daemon), or RES (Remote Execution Server) ports. The LIM (Load Information Manager) port, however, is critical for RTM to function properly. If that port is not specified, RTM will not be able to communicate with the LSF cluster.

"Diminished" status of cluster

Sometimes RTM collects job information but the status of the cluster is "Diminished". One of the main causes is name resolution.

  • Open a terminal on both the LSF master and the RTM machine and do the following.
    # vim /etc/hosts
    
  • Make sure all the hostnames are correct and can be resolved from both machines.
    # ping <hostname_of_RTM/LSF>
  • Restart the LSF cluster if changes have been made.

The status should change from "Diminished" to "Up".
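
Before pinging each host by hand, the hosts file can be checked for missing entries. The following is a sketch only: missing_hosts is a hypothetical helper, and rtmhost and lsfmaster are placeholder names. It checks the file contents, so the ping test above is still the real proof that resolution works.

```shell
# Sketch: report which of the given hostnames have no entry in a hosts file.
# missing_hosts is a hypothetical helper name.
missing_hosts() {
    file=$1; shift
    for h in "$@"; do
        # Look for the name in any column after the IP address; skip comment lines.
        awk -v h="$h" '!/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                       END { exit !found }' "$file" || echo "$h"
    done
}

# Example (placeholder names); run on both the LSF master and the RTM machine:
#   missing_hosts /etc/hosts rtmhost lsfmaster
```

No output means every name was found in the file.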

How to add all devices using grid_add_cluster.php?

  • Open a terminal and navigate to
    # cd /opt/cacti/plugins/grid/
    
  • Type the following command to use the grid_add_cluster.php script to add the devices
    # php -q grid_add_cluster.php --type=1 --clusterid=all --template=14
    
    • type=1 refers to adding a device
    • clusterid=all refers to all the clusters in the cluster list
    • template=14 refers to the Grid Host template

Run with these parameters, the script adds all the hosts in the cluster list to the device list. It also creates the graphs using the Grid Host template.