Preparation in the host
create /dev/sdb1
then mount it on /mnt/kvm
Defining GWMS
Edited IP in Net document.
Edited Cobbler file.
Copied cobbler profile (from itbv-ce-condor)
Puppet is on mgt (uct2-mgt)
virsh start ui-gwms.uchicago.edu
virsh console ui-gwms.uchicago.edu
virsh list
koan --server=uct2-grid1.mwt2.org --virt --system=ui-gwms.uchicago.edu
virsh dumpxml ui-gwms.uchicago.edu | less
iTaP
- Springboard - Joomla portal
- David Braun - head of visualization, moving data
condor_config_val TOOL_DEBUG
[root@ui-gwms ~]# ps -efyww |grep python
[root@ui-gwms ~]# /etc/init.d/frontend_startup reconfig
[root@ui-gwms ~]# condor_config_val GSI_DAEMON_CERT
[root@ui-gwms ~]# CONDOR_CONFIG=/usr/share/gwms-frontend/frontend-temp/frontend.condor_config condor_status -any -pool glidein-1.t2.ucsd.edu -constraint "glideinmytype == \"glideclient\"" -format "%s\n" clientname |sort -u
[root@ui-gwms ~]# CONDOR_CONFIG=/usr/share/gwms-frontend/frontend-temp/frontend.condor_config condor_status -any -pool glidein-1.t2.ucsd.edu | sort -u
[root@ui-gwms ~]# CONDOR_CONFIG=/usr/share/gwms-frontend/frontend-temp/frontend.condor_config condor_status -any
[root@ui-gwms ~]# grid-proxy-init -cert /etc/grid-security/giwms/gi-vofe-glidein-fe-condor-cert.pem -key /etc/grid-security/giwms/gi-vofe-glidein-fe-condor-key.pem
New VM with KVM on MWT2
Documentation:
Copy the profile in Cobbler using the web interface: copy from a similar existing node and edit what differs.
Edit modules/itb in puppet (mgt). Also here you can start from a similar node:
[root@uct2-mgt puppet]# vi /etc/puppet/manifests/nodes.itb.pp
koan defines the VM and starts the installation; then check on it. Once the installation completes the VM is turned off, so it needs to be started again.
[root@itb-kvm3 ~]# koan --server=uct2-grid1.mwt2.org --virt --system=ui-cr.uchicago.edu
virsh list
virsh console ui-cr.uchicago.edu
virsh start ui-cr.uchicago.edu
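The check-then-start step above can be scripted. This is a minimal sketch, assuming `virsh domstate` reports "shut off" once the installer powers the VM down; the function name, the polling interval and the retry limit are hypothetical and not part of the original procedure.

```shell
#!/bin/sh
# Sketch: wait for the koan-started installation to finish, then boot the VM.
# Assumption: "virsh domstate NAME" prints "shut off" when the installer is done.
start_when_installed() {
    vm="$1"                 # libvirt domain name, e.g. ui-cr.uchicago.edu
    interval="${2:-30}"     # seconds between checks (hypothetical default)
    max_checks="${3:-120}"  # give up after this many checks (hypothetical default)
    i=0
    while [ "$i" -lt "$max_checks" ]; do
        state=$(virsh domstate "$vm" 2>/dev/null)
        if [ "$state" = "shut off" ]; then
            # Installation finished and the VM powered off: start it again.
            virsh start "$vm"
            return $?
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "timed out waiting for $vm to shut off" >&2
    return 1
}
```

Usage on the KVM host would look like `start_when_installed ui-cr.uchicago.edu`, followed by `virsh console ui-cr.uchicago.edu` to watch the boot.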
Other commands:
[root@itb-kvm3 ~]# virsh reboot ui-cr.uchicago.edu
[root@itb-kvm3 ~]# virsh shutdown ui-cr.uchicago.edu
[root@itb-kvm3 ~]# virsh destroy ui-cr.uchicago.edu
[root@itb-kvm3 ~]# virsh undefine ui-cr.uchicago.edu
[root@itb-kvm3 ~]# rm /mnt/kvm/ui-cr.uchicago.edu-disk0
Then reinstall (the installation starts shortly after):
[root@itb-kvm3 ~]# koan --server=uct2-grid1.mwt2.org --virt --system=ui-cr.uchicago.edu
URLs to download Condor:
and CF:
On pads
- create "cf" off my home dir
- download stripped version
follow instructions:
- names are different from Condor conventions
Different blah path:
/home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/libexec/glite/etc/batch_gahp.config
blah configuration: suspect something is missing (e.g. condor path)
No instruction to start Condor.
Condor was not really installed (install script)
No instruction to create spool, execute and log
-bash-3.2$ mkdir /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2/log
-bash-3.2$ mkdir /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2/spool
-bash-3.2$ mkdir /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2/execute
Change execute permissions?
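Since the instructions do not cover creating these directories, the three mkdir commands above can be wrapped in one helper. This is only a sketch: the function name is hypothetical, and it deliberately leaves the permissions question open.

```shell
#!/bin/sh
# Sketch: create the per-host directories that the stripped tarball does not ship.
# Permissions are left at the umask default; whether execute needs different
# permissions is still an open question in these notes.
make_condor_local_dirs() {
    local_dir="$1"   # e.g. .../condor-7.5.6-x86_64_rhap_5-stripped/local.login2
    for d in log spool execute; do
        mkdir -p "$local_dir/$d" || return 1
    done
}
```

For example: `make_condor_local_dirs /home/marco/cf/condor-7.5.6-x86_64_rhap_5-stripped/local.login2`.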
condor_q is not working
172.5.86.6 - address not found
MWT2 - To fix ui-cr, I had to disable IPv6 in grub and reboot (otherwise the shared fs is mounted with owner nobody): add
ipv6.disable=1
at the end of the kernel line in /etc/grub.conf (/boot/grub/grub.conf)
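The grub edit can be previewed without touching the real file. A minimal sketch, assuming a legacy GRUB config where boot entries use lines starting with `kernel`; the function name is hypothetical. It prints the modified file to stdout and skips lines that already carry the option.

```shell
#!/bin/sh
# Sketch: print a copy of a grub.conf with ipv6.disable=1 appended to every
# "kernel" line that does not already have it. The input file is not modified.
add_ipv6_disable() {
    sed -e '/^[[:space:]]*kernel[[:space:]]/{/ipv6\.disable=1/!s/[[:space:]]*$/ ipv6.disable=1/;}' "$1"
}
```

One way to use it: `add_ipv6_disable /etc/grub.conf > /tmp/grub.conf.new`, compare with `diff /etc/grub.conf /tmp/grub.conf.new`, then copy the new file in place and reboot.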
On the submit host (ui-gwms, as marco, where Condor is already installed), starting from
https://twiki.grid.iu.edu/bin/view/Tier3/CondorSharedInstall
./condor_install --prefix=/opt/marco/condor --type=submit
then modified local config file (ALLOW_WRITE)
Configuration:
- FLOCK_TO - This must be set to include the machine running the campus factory.
- SEC_DEFAULT_AUTHENTICATION_METHODS - It must include CLAIMTOBE. A typical value is FS,CLAIMTOBE.
- SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION - Must be set to TRUE.
- FLOCK_NEGOTIATOR_HOSTS
- FLOCK_COLLECTOR_HOSTS
Create condor_mapfile
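The settings listed above could be collected into a fragment for the local config file. This is a sketch under assumptions: the function name and the factory host argument are hypothetical, and pointing FLOCK_COLLECTOR_HOSTS and FLOCK_NEGOTIATOR_HOSTS at $(FLOCK_TO) is my assumption (the notes list the two names without values).

```shell
#!/bin/sh
# Sketch: print a Condor config fragment for flocking to the campus factory.
# Assumption: the two FLOCK_*_HOSTS settings simply follow FLOCK_TO.
print_flocking_config() {
    factory_host="$1"   # machine running the campus factory (hypothetical argument)
    printf 'FLOCK_TO = %s\n' "$factory_host"
    # Single quotes keep $(FLOCK_TO) literal, so Condor expands it, not the shell.
    printf '%s\n' \
        'FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)' \
        'FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)' \
        'SEC_DEFAULT_AUTHENTICATION_METHODS = FS,CLAIMTOBE' \
        'SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = TRUE'
}
```

The output can be reviewed and then appended to condor_config.local; the ALLOW_WRITE change mentioned earlier is not included here.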
04/04/11 16:44:07 PERMISSION DENIED to marco@pads.ci.uchicago.edu from host 172.5.86.6 for command 453 (RESTART), access level ADMINISTRATOR: reason: ADMINISTRATOR authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 172.5.86.6,login2-172.pads.ci.uchicago.edu,login2-172
Firewall problem
New install of CF on ITB4
mkdir ~/cf/
install of condor from ui-gwms
-bash-3.2$ cd condor-7.5.6-x86_64_rhap_5-stripped/
bin/ DOC include/ libexec/ README
condor_configure etc/ INSTALL LICENSE-2.0.txt sbin/
condor_install examples/ lib/ man/ src/
-bash-3.2$ cd condor-7.5.6-x86_64_rhap_5-stripped/
-bash-3.2$ ./condor_install --prefix=/home/marco/cf/condor
Installing Condor from /tmp/condor-src/condor-7.5.6-x86_64_rhap_5-stripped to /share/home/marco/cf/condor-7.5.6
Unable to find a valid Java installation
Java Universe will not work properly until the JAVA
(and JAVA_MAXHEAP_ARGUMENT) parameters are set in the configuration file!
Condor has been installed into:
/share/home/marco/cf/condor-7.5.6
Configured condor using these configuration files:
global: /share/home/marco/cf/condor-7.5.6/etc/condor_config
local: /share/home/marco/cf/condor-7.5.6/local.ui-gwms/condor_config.local
In order for Condor to work properly you must set your CONDOR_CONFIG environment variable to point to your Condor configuration file:
/share/home/marco/cf/condor-7.5.6/etc/condor_config
before running Condor commands/daemons.
Created scripts which can be sourced by users to setup their Condor environment variables. These are:
sh: /share/home/marco/cf/condor-7.5.6/condor.sh
csh: /share/home/marco/cf/condor-7.5.6/condor.csh
Troubleshooting
puppet did not apply the rules:
cannot login as user
- all /home dirs are mounted as nobody
- remember to disable IPv6 in /etc/sysconfig/network (I don't know why, but it seems to interfere with mounting the home dirs)
KVM migration
References:
Checking from the CF
Connection to ui-gwms.uchicago.edu closed.
-bash-3.2$ condor_q -name ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name ui-gwms.uchicago.edu
Error: Collector has no record of schedd/submitter
-bash-3.2$ condor_q -name ui-gwms.uchicago.edu
-- Schedd: ui-gwms.uchicago.edu : <128.135.158.145:37333>
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
4.0 marco 4/8 18:34 0+00:00:00 I 0 0.0 hostname
5.0 marco 4/12 13:50 0+00:00:00 I 0 0.0 hostname
7.0 marco 4/14 16:30 0+00:00:00 I 0 0.0 hostname
8.0 marco 4/14 16:30 0+00:00:00 I 0 0.0 hostname
4 jobs; 4 idle, 0 running, 0 held
-bash-3.2$ condor_q
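The repeated condor_q attempts above (the schedd took a while to show up in the collector) can be replaced by a small polling loop. A sketch, assuming success is recognized by the "-- Schedd:" header seen in the output above; the function name and the default retry count and delay are hypothetical.

```shell
#!/bin/sh
# Sketch: poll condor_q until the collector knows about the remote schedd.
# Success is detected by the "-- Schedd:" header line in the condor_q output.
wait_for_schedd() {
    schedd="$1"          # e.g. ui-gwms.uchicago.edu
    tries="${2:-10}"     # number of attempts (hypothetical default)
    delay="${3:-30}"     # seconds between attempts (hypothetical default)
    n=0
    while [ "$n" -lt "$tries" ]; do
        if condor_q -name "$schedd" 2>&1 | grep -q '^-- Schedd:'; then
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    echo "collector still has no record of $schedd" >&2
    return 1
}
```

For example: `wait_for_schedd ui-gwms.uchicago.edu && condor_q -name ui-gwms.uchicago.edu`.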
Recreating a VM (Destroy + Create)
Remember to both destroy and undefine the machine, else koan will complain that the name is already in use.
virsh destroy ui-gwms.uchicago.edu
rm /mnt/kvm/ui-gwms.uchicago.edu-disk0
virsh undefine ui-gwms.uchicago.edu
koan --server=bootstrap.mwt2.org --virt --system=ui-gwms.uchicago.edu
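To avoid forgetting one of the steps, the sequence above can be generated for any VM name. This sketch only prints the commands (a dry run) so they can be reviewed before anything is destroyed; the function name is hypothetical and the disk path follows the /mnt/kvm/NAME-disk0 pattern used above.

```shell
#!/bin/sh
# Sketch: print (not run) the destroy + undefine + reinstall sequence for a VM.
recreate_vm_commands() {
    vm="$1"                      # libvirt domain name
    server="$2"                  # cobbler server passed to koan
    disk_dir="${3:-/mnt/kvm}"    # directory holding the VM disk image
    printf 'virsh destroy %s\n' "$vm"
    printf 'rm %s/%s-disk0\n' "$disk_dir" "$vm"
    printf 'virsh undefine %s\n' "$vm"
    printf 'koan --server=%s --virt --system=%s\n' "$server" "$vm"
}
```

Running `recreate_vm_commands ui-gwms.uchicago.edu bootstrap.mwt2.org` reproduces the four commands listed above; after checking them, the output can be piped to `sh`.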
Accessing the disk of an inactive VM
There are tools like guestfish, part of libguestfs (http://libguestfs.org/).
Here's a better recipe:
# to see the first free loopback device
losetup -f
losetup -v /dev/loop1 disk_file.img
kpartx -a /dev/loop1
ls -l /dev/mapper/loop1*
mount -oro /dev/mapper/loop1p1 /mnt
# to unmount the device
umount /mnt
kpartx -d /dev/loop1
losetup -d /dev/loop1
kpartx tells the device mapper to make devices in /dev/mapper corresponding to what's in loop1's partition table.
You have to run kpartx -d before you can run losetup -d and free the loopback device attached to the file.
Installation of uc3-test
Installation of a new VM uc3-test.uchicago.edu, dual homed, private and public network.
Problems encountered:
- NIC with public IP not working (not configured at boot)
- set the hostname to uc3-test.uchicago.edu (in /etc/sysconfig/network and with hostname NAME; before it was the mwt2.org name)
- configure the public interface /etc/sysconfig/network-scripts/ifcfg-eth1
- After this both NICs can be reached from the inside network
- Host not reachable from the outside (e.g. from hep or laptop)
- set GATEWAY=128.135.158.241 (public IP of the gateway uct2-6509) in /etc/sysconfig/network
- no one knows why, but it works
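The two files touched in the fixes above can be sketched as follows. This is an illustration only: the function names are hypothetical, the static-address layout (BOOTPROTO=static, ONBOOT=yes) is an assumption about how the public interface was configured, and the address and netmask must be supplied by whoever runs it (the notes do not record them).

```shell
#!/bin/sh
# Sketch: print an ifcfg file for a statically configured interface.
# ONBOOT=yes is what brings the NIC up at boot.
print_ifcfg() {
    dev="$1"    # e.g. eth1
    ip="$2"     # public IP of the host (not recorded in these notes)
    mask="$3"   # netmask (not recorded in these notes)
    printf 'DEVICE=%s\nBOOTPROTO=static\nIPADDR=%s\nNETMASK=%s\nONBOOT=yes\n' \
        "$dev" "$ip" "$mask"
}

# Sketch: print /etc/sysconfig/network with hostname and default gateway.
print_sysconfig_network() {
    host="$1"   # e.g. uc3-test.uchicago.edu
    gw="$2"     # e.g. 128.135.158.241
    printf 'NETWORKING=yes\nHOSTNAME=%s\nGATEWAY=%s\n' "$host" "$gw"
}
```

The output of `print_ifcfg eth1 ADDRESS NETMASK` would go to /etc/sysconfig/network-scripts/ifcfg-eth1, and `print_sysconfig_network uc3-test.uchicago.edu 128.135.158.241` to /etc/sysconfig/network, followed by a network restart.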
Things to verify when there is time:
- That the routing is correct and internal hosts are reached directly without going to the outside network
References
https://help.ubuntu.com/community/KVM/Managing
http://doc.opensuse.org/products/opensuse/openSUSE/opensuse-kvm/cha.libvirt.admin.html
http://wiki.libvirt.org/page/Networking#Bridged_networking_.28aka_.22shared_physical_device.22.29
http://www.linux-kvm.org/page/Main_Page
--
MarcoMambelli - 01 May 2011