CondorCycleSeeder
The name chosen for the pool (CONDOR_HOST) is uc3-mgt.mwt2.org.
This implies that by default the collector/negotiator will advertise that hostname (which is not resolvable outside) and initiate connections from the private IP (10.1.3.95), which is not routable outside.
This is not a problem at the moment since all the hosts intended to flock into the cluster are inside that 10.x.x.x subnet.
To change that, see the section "How to make the Cycle Seeder Condor pool visible outside the 10.x network" below.
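To check which address the daemons actually advertise, something like this should work from inside the 10.x network (a sketch using the standard condor_status tool):
# Ask the collector for the master ads it holds; the printed address
# shows which IP the daemons advertise
condor_status -pool uc3-mgt.mwt2.org -master -format "%s\n" MasterIpAddr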
uc3-mgt install & config
manager configurations
Files (a sketch of how they are loaded follows this list):
- /etc/condor/condor_config is the main file, configured for the MWT2
- /etc/condor/config.d is the configuration directory for special custom configurations, e.g. mathematica
- Local customizations are in /etc/condor/condor_config.local
- Temporary per-node customizations can be made in /etc/condor/condor_config.override
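A sketch of how this layout is typically wired up in the main condor_config (LOCAL_CONFIG_DIR and LOCAL_CONFIG_FILE are standard Condor knobs; the exact lines in the MWT2 file are assumed, not copied from it):
# Drop-in directory plus local files; within the LOCAL_CONFIG_FILE list,
# later files win, so condor_config.override takes precedence
LOCAL_CONFIG_DIR = /etc/condor/config.d
LOCAL_CONFIG_FILE = /etc/condor/condor_config.local, /etc/condor/condor_config.override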
Notes on the setup (a consolidated sketch follows this list):
- condor host is uc3-mgt.mwt2.org
- domain is osg-gk.mwt2.org, to allow MWT2 jobs to run using the same accounts as on MWT2
- SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = TRUE should not be needed, but does not hurt either
- Local schedd (for testing only):
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD
- Authentication:
ALLOW_WRITE = $(ALLOW_WRITE), *.mwt2.org, *.uchicago.edu, uct2-*.uchicago.edu, iut2-*.iu.edu
INTERNAL_IPS = 10.1.3.* 10.1.4.* 10.1.5.* 128.135.158.241 uct2-6509.uchicago.edu
- Flocking:
FLOCK_FROM = uc3-cloud.uchicago.edu, uc3-cloud.mwt2.org, uc3-sub.uchicago.edu, uc3-sub.mwt2.org, ui-cr.uchicago.edu, ui-cr.mwt2.org, millikan.uchicago.edu, osg-gk.mwt2.org, condor.mwt2.org
- Condor view:
CONDOR_VIEW_HOST = uc3-cloud.uchicago.edu:39618
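Taken together, a manager-side condor_config.local consistent with the notes above might look like this (a sketch assembled from the settings listed on this page, not a copy of the live file; UID_DOMAIN is an assumption for what "domain" refers to):
CONDOR_HOST = uc3-mgt.mwt2.org
# "domain" above assumed to mean UID_DOMAIN (same accounts as MWT2)
UID_DOMAIN = osg-gk.mwt2.org
SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = TRUE
# SCHEDD is listed only for testing; drop it for a pure manager
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD
ALLOW_WRITE = $(ALLOW_WRITE), *.mwt2.org, *.uchicago.edu, uct2-*.uchicago.edu, iut2-*.iu.edu
INTERNAL_IPS = 10.1.3.* 10.1.4.* 10.1.5.* 128.135.158.241 uct2-6509.uchicago.edu
FLOCK_FROM = uc3-cloud.uchicago.edu, uc3-cloud.mwt2.org, uc3-sub.uchicago.edu, uc3-sub.mwt2.org, ui-cr.uchicago.edu, ui-cr.mwt2.org, millikan.uchicago.edu, osg-gk.mwt2.org, condor.mwt2.org
CONDOR_VIEW_HOST = uc3-cloud.uchicago.edu:39618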
uc3-c0[20] compute nodes
- These nodes were built from a uct2 worker node image, so the existing Condor package first needs to be removed. Start on uc3-c001.
[root@uc3-c001 condor]# rpm -qa | grep condor
condor-7.6.4-1.x86_64
[root@uc3-c001 condor]# rpm -e condor-7.6.4-1.x86_64
Shutting down Condor (fast-shutdown mode)... done.
warning: /etc/condor/condor_config.local saved as /etc/condor/condor_config.local.rpmsave
warning: /etc/condor/condor_config saved as /etc/condor/condor_config.rpmsave
[root@uc3-c001 condor]#
- Download the Condor repository package (condor-development-rhel5.repo) into /etc/yum.repos.d
- Then we have
[root@uc3-c001 yum.repos.d]# ls
cobbler-config.repo condor-development-rhel5.repo dell-omsa-repository.repo vdt.repo
- Then run yum install condor.x86_64 (full install log: InstallLogCondorCompute). The same swap can be repeated on the remaining nodes as sketched below.
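A loop along these lines should cover the rest of the nodes (a sketch; the uc3-c002..uc3-c020 range is assumed from the section heading):
# Push the Condor yum repo, remove the old package, install the new one
for i in $(seq -w 2 20); do
  scp /etc/yum.repos.d/condor-development-rhel5.repo root@uc3-c0$i:/etc/yum.repos.d/
  ssh root@uc3-c0$i 'rpm -e condor-7.6.4-1.x86_64 && yum -y install condor.x86_64'
done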
compute node configurations
Files:
- /etc/condor/condor_config is the main file, configured for the MWT2
- /etc/condor/config.d is the configuration directory for special custom configurations, e.g. mathematica
- Local customizations are in /etc/condor/condor_config.local
- Temporary per-node customizations can be made in /etc/condor/condor_config.override
Notes on the setup (a worker-side sketch follows this list):
- condor host is uc3-mgt.mwt2.org
- domain is osg-gk.mwt2.org, to allow MWT2 jobs to run using the same accounts as on MWT2
- all flocking jobs from different domains use the uc3 account
- nodecheck is executed to set the node online/offline; pandaid has been removed
- authentication forwarding from the Negotiator is enabled
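A minimal worker-side condor_config.local consistent with these notes might look like the following (a sketch: only the condor host and domain come from this page; the daemon list is the usual worker setup, and the STARTD_CRON wiring and nodecheck path are assumptions):
CONDOR_HOST = uc3-mgt.mwt2.org
# "domain" assumed to mean UID_DOMAIN (same accounts as MWT2)
UID_DOMAIN = osg-gk.mwt2.org
# A worker node runs only the master and the startd
DAEMON_LIST = MASTER, STARTD
# Assumed wiring: run nodecheck periodically via the startd cron mechanism
STARTD_CRON_JOBLIST = NODECHECK
STARTD_CRON_NODECHECK_EXECUTABLE = /usr/local/bin/nodecheck
STARTD_CRON_NODECHECK_PERIOD = 5m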
The configuration of the mathematica nodes should be split out and put in the /etc/condor/config.d directory.
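For example, the mathematica lines could move into a drop-in file like this (the file name and the HAS_MATHEMATICA attribute are hypothetical illustrations; the real knobs are the ones currently in the local config):
# /etc/condor/config.d/mathematica (hypothetical file name)
# Advertise the capability so jobs can put it in their Requirements
HAS_MATHEMATICA = True
STARTD_ATTRS = $(STARTD_ATTRS), HAS_MATHEMATICA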
Optional configurations:
How to make the Cycle Seeder Condor pool visible outside the 10.x network
The head node is already listening on all network interfaces, but there may be problems with the IP address it advertises when initiating communication, and with reaching the worker nodes.
The necessary steps are as follows (a verification sketch follows the steps):
- Configure the primary IP as the public one:
# so that machines outside can contact it
# These commented lines should be the default
#COLLECTOR.BIND_ALL_INTERFACES = TRUE
#NEGOTIATOR.BIND_ALL_INTERFACES = TRUE
#BIND_ALL_INTERFACES = TRUE
NETWORK_INTERFACE = 129.93.229.141
- Configure CONDOR_HOST = uc3-mgt.uchicago.edu on the worker nodes.
To allow the Schedd to contact the worker nodes, CCB (the Condor Connection Broker) should be set up, because the worker nodes are not reachable from outside the network (though they do have outbound connectivity).
- On all the worker nodes (supposing that the Collector is also CCB):
CCB_ADDRESS = $(COLLECTOR_HOST)
- Give the condor_collector acting as the CCB server a large enough limit on open file descriptors, e.g. twice 500*(1+1+8):
COLLECTOR.MAX_FILE_DESCRIPTORS = 10000
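Once these steps are in place, the result can be checked with the standard tools (a sketch):
# From a machine outside the 10.x network: the pool should answer publicly
condor_status -pool uc3-mgt.uchicago.edu
# On a worker node: confirm where it registers for CCB callbacks
condor_config_val CCB_ADDRESS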
Optionally, you can configure the worker nodes and the head node as being part of the same private 10.x network, so that their mutual traffic stays inside the private network without crossing the router (a quick check follows these settings):
- Set the private network on the Collector/Negotiator:
PRIVATE_NETWORK_INTERFACE=10.1.3.95
PRIVATE_NETWORK_NAME=uc3-mgt.mwt2.org
- And the network name on the worker nodes:
PRIVATE_NETWORK_NAME=uc3-mgt.mwt2.org
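A quick consistency check (a sketch; condor_config_val is standard):
# Run on the head node and on a worker; the two names must match exactly
condor_config_val PRIVATE_NETWORK_NAME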
The worker nodes (and the uc3-sub submit host) are already configured to trust authentications forwarded by the Negotiator:
# To allow transitive authentication
SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = True
ALLOW_DAEMON = submit-side@matchsession
-- RobGardner - 03 Mar 2012