Preparation in the host

Create /dev/sdb1,
then mount it on /mnt/kvm

Defining GWMS

Edited IP in Net document. Edited Cobbler file.

Copied cobbler profile (from itbv-ce-condor)

Puppet is on mgt
virsh start ui-gwms.uchicago.edu
virsh console ui-gwms.uchicago.edu
virsh list
koan --server=uct2-grid1.mwt2.org --virt --system=ui-gwms.uchicago.edu 
virsh dumpxml ui-gwms.uchicago.edu | less

iTaP
  • Springboard - Joomla portal
  • David Braun - head of visualization, moving data

condor_config_val TOOL_DEBUG
[root@ui-gwms ~]# ps -efyww |grep python
[root@ui-gwms ~]# /etc/init.d/frontend_startup reconfig 
[root@ui-gwms ~]# condor_config_val GSI_DAEMON_CERT
[root@ui-gwms ~]# CONDOR_CONFIG=/usr/share/gwms-frontend/frontend-temp/frontend.condor_config condor_status -any -pool glidein-1.t2.ucsd.edu -constraint "glideinmytype == \"glideclient\"" -format "%s\n" clientname |sort -u
[root@ui-gwms ~]# CONDOR_CONFIG=/usr/share/gwms-frontend/frontend-temp/frontend.condor_config condor_status -any -pool glidein-1.t2.ucsd.edu | sort -u
[root@ui-gwms ~]# CONDOR_CONFIG=/usr/share/gwms-frontend/frontend-temp/frontend.condor_config condor_status -any
[root@ui-gwms ~]# grid-proxy-init -cert /etc/grid-security/giwms/gi-vofe-glidein-fe-condor-cert.pem -key /etc/grid-security/giwms/gi-vofe-glidein-fe-condor-key.pem 

New VM with KVM on MWT2

Documentation:

Copy profile in cobbler using the Web interface: copy from a similar existing node and edit changes.

Edit modules/itb in puppet (mgt). Also here you can start from a similar node:
[root@uct2-mgt puppet]# vi /etc/puppet/manifests/nodes.itb.pp

koan defines the VM and starts the installation. Then check its progress (virsh list, virsh console). Once the installation completes the VM is turned off and needs to be started again (virsh start).
[root@itb-kvm3 ~]# koan --server=uct2-grid1.mwt2.org --virt --system=ui-cr.uchicago.edu

virsh list
virsh console ui-cr.uchicago.edu

virsh start ui-cr.uchicago.edu
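
The wait-then-start step can be scripted. A minimal sketch, assuming that polling "virsh domstate" until it reports "shut off" is an acceptable way to detect the end of the installation; wait_and_start and its default poll count and delay are hypothetical, not part of the original procedure:

```shell
# Hypothetical helper: poll the domain state and start the VM once the
# installer has powered it off.
wait_and_start() {
    domain=$1
    tries=${2:-60}   # assumed default: number of polls
    delay=${3:-30}   # assumed default: seconds between polls
    while [ "$tries" -gt 0 ]; do
        # "virsh domstate" prints the state, e.g. "running" or "shut off"
        state=$(virsh domstate "$domain" 2>/dev/null)
        if [ "$state" = "shut off" ]; then
            virsh start "$domain"
            return $?
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    echo "timed out waiting for $domain to shut off" >&2
    return 1
}

# Example: wait_and_start ui-cr.uchicago.edu
```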

Other commands:
[root@itb-kvm3 ~]# virsh reboot ui-cr.uchicago.edu
[root@itb-kvm3 ~]# virsh shutdown ui-cr.uchicago.edu
[root@itb-kvm3 ~]# virsh destroy ui-cr.uchicago.edu
[root@itb-kvm3 ~]# virsh undefine ui-cr.uchicago.edu
[root@itb-kvm3 ~]# rm /mnt/kvm/ui-cr.uchicago.edu-disk0 
Then reinstall (the installation starts shortly after):
[root@itb-kvm3 ~]# koan --server=uct2-grid1.mwt2.org --virt --system=ui-cr.uchicago.edu

URLs to download Condor: and CF:

On pads
  • create "cf" off my home dir
  • download stripped version
follow instructions:
  • names are different from Condor conventions

Different blah path: /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/libexec/glite/etc/batch_gahp.config
blah configuration: suspect something is missing (e.g. condor path)

No instruction to start Condor. Condor was not really installed (install script)

No instruction to create spool, execute and log
-bash-3.2$ mkdir /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2/log
-bash-3.2$ mkdir /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2/spool
-bash-3.2$ mkdir /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2/execute
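
The three mkdir commands above can be folded into one helper. This is only a sketch; make_condor_dirs is a hypothetical name and the directory list is the one used in these notes:

```shell
# Hypothetical helper: create the log, spool and execute directories
# under the given Condor local directory.
make_condor_dirs() {
    local_dir=$1
    for d in log spool execute; do
        mkdir -p "$local_dir/$d" || return 1
    done
}

# Example (path from the notes above):
# make_condor_dirs /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2
```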

Change execute permissions?
condor_q is not working
172.5.86.6 - address not found

MWT2 - To fix ui-cr, I needed to disable IPv6 in grub and reboot (otherwise the shared fs is mounted as nobody): add ipv6.disable=1 at the end of the kernel line in /etc/grub.conf (/boot/grub/grub.conf)
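
After the reboot it may be worth checking that the option really reached the kernel. A small sketch, assuming the running kernel command line is readable from /proc/cmdline; ipv6_disabled is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if ipv6.disable=1 appears as a separate
# word on the kernel command line (default: /proc/cmdline).
ipv6_disabled() {
    cmdline_file=${1:-/proc/cmdline}
    # One option per line, then require an exact match.
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'ipv6\.disable=1'
}

# Example: ipv6_disabled && echo "IPv6 is disabled on the kernel command line"
```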

On the submit host (ui-gwms, as marco, where condor is already installed), starting from https://twiki.grid.iu.edu/bin/view/Tier3/CondorSharedInstall:
./condor_install --prefix=/opt/marco/condor --type=submit
Then modified the local config file (ALLOW_WRITE).

Configuration:
  • FLOCK_TO - This must be set to include the machine running the campus factory.
  • SEC_DEFAULT_AUTHENTICATION_METHODS - It must include CLAIMTOBE. A typical value is FS,CLAIMTOBE.
  • SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION - Must be set to TRUE.
  • FLOCK_NEGOTIATOR_HOSTS
  • FLOCK_COLLECTOR_HOSTS
Create
condor_mapfile
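
The settings listed above could be generated as a local config fragment. A sketch only: print_flock_config is a hypothetical helper, and pointing FLOCK_COLLECTOR_HOSTS and FLOCK_NEGOTIATOR_HOSTS at $(FLOCK_TO) is an assumption, since these notes give no values for them:

```shell
# Hypothetical helper: print a condor_config.local fragment for flocking
# to the campus factory host given as the first argument.
print_flock_config() {
    cf_host=$1
    cat <<EOF
FLOCK_TO = $cf_host
# Assumption: collector and negotiator run on the campus factory host
FLOCK_COLLECTOR_HOSTS = \$(FLOCK_TO)
FLOCK_NEGOTIATOR_HOSTS = \$(FLOCK_TO)
SEC_DEFAULT_AUTHENTICATION_METHODS = FS,CLAIMTOBE
SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = TRUE
EOF
}

# Example: print_flock_config cf-host.example.org >> condor_config.local
```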

04/04/11 16:44:07 PERMISSION DENIED to marco@pads.ci.uchicago.edu from host 172.5.86.6 for command 453 (RESTART), access level ADMINISTRATOR: reason: ADMINISTRATOR authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 172.5.86.6,login2-172.pads.ci.uchicago.edu,login2-172

Firewall problem

New install of CF on ITB4

mkdir ~/cf/
install of condor from ui-gwms
-bash-3.2$ cd condor-7.5.6-x86_64_rhap_5-stripped/
bin/              DOC               include/          libexec/          README
condor_configure  etc/              INSTALL           LICENSE-2.0.txt   sbin/
condor_install    examples/         lib/              man/              src/
-bash-3.2$ cd condor-7.5.6-x86_64_rhap_5-stripped/
-bash-3.2$ ./condor_install --prefix=/home/marco/cf/condor
Installing Condor from /tmp/condor-src/condor-7.5.6-x86_64_rhap_5-stripped to /share/home/marco/cf/condor-7.5.6
Unable to find a valid Java installation
Java Universe will not work properly until the JAVA (and JAVA_MAXHEAP_ARGUMENT) parameters are set in the configuration file!

Condor has been installed into:
    /share/home/marco/cf/condor-7.5.6
Configured condor using these configuration files:
  global: /share/home/marco/cf/condor-7.5.6/etc/condor_config
  local:  /share/home/marco/cf/condor-7.5.6/local.ui-gwms/condor_config.local
In order for Condor to work properly you must set your CONDOR_CONFIG environment variable to point to your Condor configuration file:
/share/home/marco/cf/condor-7.5.6/etc/condor_config
before running Condor commands/daemons.
Created scripts which can be sourced by users to setup their Condor environment variables. These are:
   sh: /share/home/marco/cf/condor-7.5.6/condor.sh
  csh: /share/home/marco/cf/condor-7.5.6/condor.csh
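
A sketch of how the generated condor.sh might be used from a login script; use_condor is a hypothetical helper, and it only assumes that condor.sh sets CONDOR_CONFIG as the installer output says:

```shell
# Hypothetical helper: source condor.sh from an install prefix and
# report which configuration file is now in effect.
use_condor() {
    prefix=$1
    if [ -r "$prefix/condor.sh" ]; then
        . "$prefix/condor.sh"
    else
        echo "no condor.sh under $prefix" >&2
        return 1
    fi
    echo "CONDOR_CONFIG=$CONDOR_CONFIG"
}

# Example: use_condor /share/home/marco/cf/condor-7.5.6
```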

Troubleshooting

puppet did not apply the rules:
  • puppet may have network problems
    puppetd --test -v
    err: Could not request certificate: Network is unreachable - connect(2)
    Exiting; failed to retrieve certificate and waitforcert is disabled
    
  • check routes
  • set the hostname in cobbler but not the gateway (at least not for mwt2.org host)
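
For the "check routes" step, a quick test for a default route can help explain a "Network is unreachable" error. A minimal sketch, assuming the routing table is listed with "ip route show"; has_default_route is a hypothetical helper that reads that output on stdin:

```shell
# Hypothetical helper: succeed if the routing table read from stdin
# contains a default route.
has_default_route() {
    grep -q '^default '
}

# Example: ip route show | has_default_route || echo "no default route"
```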

cannot login as user
  • all /home dirs are mounted as nobody
  • remember to disable IPV6 in /etc/sysconfig/network (don't know why but it seems to interfere with the ability to mount home dirs)

KVM migration

References:

Checking from the CF

Connection to ui-gwms.uchicago.edu closed.
-bash-3.2$ condor_q -name  ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name  ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name  ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name  ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name  ui-gwms.uchicago.edu


-- Schedd: ui-gwms.uchicago.edu : <128.135.158.145:37333>
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   4.0   marco           4/8  18:34   0+00:00:00 I  0   0.0  hostname          
   5.0   marco           4/12 13:50   0+00:00:00 I  0   0.0  hostname          
   7.0   marco           4/14 16:30   0+00:00:00 I  0   0.0  hostname          
   8.0   marco           4/14 16:30   0+00:00:00 I  0   0.0  hostname          

4 jobs; 4 idle, 0 running, 0 held
-bash-3.2$ condor_q
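
The repeated condor_q calls above can be replaced by a small retry loop. This is a sketch, assuming condor_q exits with a non-zero status while the collector has no record of the schedd; retry is a hypothetical helper:

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping
# DELAY seconds between tries; succeed as soon as the command succeeds.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example: retry 10 30 condor_q -name ui-gwms.uchicago.edu
```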

Recreating a VM (Destroy + Create)

Remember to both destroy and undefine the machine, else koan will complain that the name is already in use.
virsh destroy ui-gwms.uchicago.edu
rm /mnt/kvm/ui-gwms.uchicago.edu-disk0
virsh undefine ui-gwms.uchicago.edu
koan --server=bootstrap.mwt2.org --virt --system=ui-gwms.uchicago.edu
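
The same sequence as a function with a dry-run switch, so the commands can be reviewed before anything is deleted. A sketch only: recreate_vm and the DRY_RUN variable are hypothetical, and the disk path follows the /mnt/kvm/NAME-disk0 pattern used in these notes:

```shell
# Hypothetical helper: destroy, remove and reinstall a VM.
# With DRY_RUN set to a non-empty value the commands are only printed.
recreate_vm() {
    domain=$1
    server=${2:-bootstrap.mwt2.org}
    disk=/mnt/kvm/${domain}-disk0
    run=${DRY_RUN:+echo}
    $run virsh destroy "$domain"    # may fail if the VM is not running
    $run rm -f "$disk"
    $run virsh undefine "$domain" || return 1
    $run koan --server="$server" --virt --system="$domain"
}

# Example: DRY_RUN=1; recreate_vm ui-gwms.uchicago.edu
```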

Accessing the disk of an inactive VM

There are tools like guestfish (part of libguestfs, http://libguestfs.org/). Here's a better recipe:
# to see the first free loopback device
losetup -f

losetup -v /dev/loop1 disk_file.img
kpartx -a /dev/loop1
ls -l /dev/mapper/loop1*
mount -o ro /dev/mapper/loop1p1 /mnt

# to unmount the device
umount /mnt
kpartx -d /dev/loop1
losetup -d /dev/loop1

kpartx tells the device mapper to make devices in /dev/mapper corresponding to what's in loop1's partition table.

You have to kpartx -d in order to be able to losetup -d and free the loopback of the file.
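
The recipe can be wrapped in two functions so that the free loop device is picked automatically. A sketch, not tested on the KVM hosts: mount_image_ro and umount_image are hypothetical names, it must run as root, and there is a small race between "losetup -f" and the attach step:

```shell
# Hypothetical helper: attach IMAGE to the first free loop device, map
# its partitions and mount partition PART (default 1) read-only.
mount_image_ro() {
    image=$1
    mountpoint=$2
    part=${3:-1}
    loopdev=$(losetup -f) || return 1
    losetup "$loopdev" "$image" || return 1
    kpartx -a "$loopdev" || { losetup -d "$loopdev"; return 1; }
    mount -o ro "/dev/mapper/$(basename "$loopdev")p$part" "$mountpoint"
}

# Hypothetical helper: undo mount_image_ro in the required order.
umount_image() {
    loopdev=$1
    mountpoint=$2
    umount "$mountpoint" && kpartx -d "$loopdev" && losetup -d "$loopdev"
}

# Example: mount_image_ro disk_file.img /mnt
#          umount_image /dev/loop1 /mnt
```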

Installation of uc3-test

Installation of a new VM uc3-test.uchicago.edu, dual homed, private and public network.

Problems encountered:
  • NIC with public IP not working (not configured at boot)
    • set hostname to uc3-test.uchicago.edu (/etc/sysconfig/network, hostname NAME; before it was the mwt2.org name)
    • configure the public interface /etc/sysconfig/network-scripts/ifcfg-eth1
    • After this both NICs can be reached from the inside network
  • Host not reachable from the outside (e.g. from hep or laptop)
    • set GATEWAY=128.135.158.241 (public IP of the gateway uct2-6509) in /etc/sysconfig/network
    • no one knows why, but it works

Things to verify when there is time:
  • That the routing is correct and internal hosts are reached directly without going to the outside network
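
One possible way to do this check is to look at "ip route get ADDRESS" for an internal host: a route through a gateway shows a "via" hop, a directly connected one does not. A sketch under that assumption; route_is_direct is a hypothetical helper that reads the command output on stdin:

```shell
# Hypothetical helper: succeed if the route read from stdin (output of
# "ip route get ADDRESS") has no gateway hop, fail if it goes "via" one.
route_is_direct() {
    read -r first_line || return 2   # no input at all
    case "$first_line" in
        *" via "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example (INTERNAL_HOST_IP is a placeholder):
# ip route get INTERNAL_HOST_IP | route_is_direct && echo "reached directly"
```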

References

https://help.ubuntu.com/community/KVM/Managing
http://doc.opensuse.org/products/opensuse/openSUSE/opensuse-kvm/cha.libvirt.admin.html
http://wiki.libvirt.org/page/Networking#Bridged_networking_.28aka_.22shared_physical_device.22.29
http://www.linux-kvm.org/page/Main_Page

-- MarcoMambelli - 01 May 2011
Topic revision: r10 - 11 Mar 2013, MarcoMambelli