proof/doc/confman/ConfigProofPoD.md
1 Setup a static PROOF cluster with PROOF on Demand
2 =================================================
3 
4 Introduction
5 ------------
6 
7 Using PROOF on Demand is our current recommended way of running a PROOF
8 cluster. The usage of PoD is in particular helpful for the following
9 reasons:
10 
11 - **Sandboxing.** Each user gets their own personal PROOF cluster,
12  separated from the others: a problem occurring on one personal
13  cluster does not affect the workflow of other users.
14 
15 - **Easier administration and self-servicing.** A user can restart their
16  personal PROOF cluster in case of trouble without waiting for a
17  system administrator's intervention.
18 
19 - **Efficient multiuser scheduling.** PROOF on Demand makes PROOF run on
20  top of an existing resource management system, moving the problem of
21  scheduling many concurrent users outside of PROOF.
22 
23 This guide refers specifically to the setup of a static PROOF cluster
24 running on physical hosts: the recommended setup is in practice the same
25 as the ready-to-go Virtual Analysis Facility. If you want to use PROOF
26 in the cloud, there is no configuration to go through.
27 
28 Setup a resource management system
29 ----------------------------------
30 
31 Although PROOF on Demand can run on a cluster of nodes without using a
32 resource management system (using `pod-ssh`), it is recommended to set up a
33 dedicated one, or a dedicated queue on an existing one, to benefit from
34 scheduling in a multiuser environment.
35 
36 As there's a variety of resource management systems, this guide does not cover
37 their setup. The RMS preconfigured for the Virtual Analysis Facility is
38 [HTCondor](http://research.cs.wisc.edu/htcondor/), which we recommend primarily
39 because it has dynamic addition of workers built in.
40 
41 Configuration steps for all nodes
42 ---------------------------------
43 
44 ### Setup CernVM-FS
45 
46 [CernVM-FS](http://cernvm.cern.ch/portal/filesystem) should be installed
47 on all machines as the preferred method for software distribution.
48 
49 > Configuration instructions for the latest CernVM-FS can be found
50 > [here](http://cernvm.cern.ch/portal/filesystem/techinformation).
51 
52 A brief step-by-step procedure to install CernVM-FS is described below.
53 
54 - Download and install the latest stable version from
55  [here](http://cernvm.cern.ch/portal/filesystem): pick one which is
56  appropriate to your operating system. You need the `cvmfs` package,
57  you *don't* need the `cvmfs-devel` or `cvmfs-server` ones.
58 
59 - As root user, run:
60 
61  # cvmfs_config setup
62 
63 - Start the `autofs` service: how to do this depends on your operating
64  system.
65 
66  On Ubuntu using Upstart:
67 
68  # restart autofs
69 
70  On RHEL-based systems or older Ubuntu releases:
71 
72  # service autofs restart
73 
74 - Prepare a `/etc/cvmfs/default.local` file (create it if it does not
75  exist) with the following configuration bits:
76 
77  ``` {.bash}
78  CVMFS_HTTP_PROXY=http://your-proxy-server.domain.ch:3128,DIRECT
79  CVMFS_REPOSITORIES=your-experiment.cern.ch,sft.cern.ch
80  CVMFS_QUOTA_LIMIT=50000
81  ```
82 
83  You need to specify your closest HTTP caching proxy: separate
84  multiple proxies with commas. The last fallback value, `DIRECT`,
85  tells cvmfs to connect directly without using any proxy at all.
86 
87  Among the list of repositories (comma-separated), always specify
88  `sft.cern.ch` and the one containing the software for your experiment
89  (e.g., `cms.cern.ch`).
90 
91  The quota limit is the amount of local disk space, in megabytes, to
92  use as cache.
93 
94 - Check the configuration and repositories with:
95 
96  # cvmfs_config chksetup
97  OK
98  # cvmfs_config probe
99  Probing /cvmfs/cms.cern.ch... OK
100  Probing /cvmfs/sft.cern.ch... OK
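
The configuration file described in the steps above can also be generated by a small script. The following is a minimal sketch, not an official CernVM-FS tool: the helper name `write_cvmfs_config` and its arguments are hypothetical, and it writes to whatever path it is given (here a temporary file, for a dry run) rather than directly to `/etc/cvmfs/default.local`. It follows the comma-separated format shown in this guide.

```shell
#!/bin/sh
# Sketch only: write_cvmfs_config is a hypothetical helper, not part of CernVM-FS.
# Usage: write_cvmfs_config <output-file> <proxy-list> <repo-list> [quota-MB]
write_cvmfs_config() {
    out_file=$1
    proxies=$2             # comma-separated proxies; DIRECT is appended as last fallback
    repos=$3               # comma-separated repositories
    quota_mb=${4:-50000}   # local cache size in megabytes

    # Always include sft.cern.ch, as the guide recommends.
    case ",$repos," in
        *,sft.cern.ch,*) ;;
        *) repos="$repos,sft.cern.ch" ;;
    esac

    cat > "$out_file" <<EOF
CVMFS_HTTP_PROXY=$proxies,DIRECT
CVMFS_REPOSITORIES=$repos
CVMFS_QUOTA_LIMIT=$quota_mb
EOF
}

# Dry run into a temporary file, using the placeholder proxy from this guide.
tmp_conf=$(mktemp)
write_cvmfs_config "$tmp_conf" "http://your-proxy-server.domain.ch:3128" "cms.cern.ch"
cat "$tmp_conf"
```

Once the generated content looks right, it can be copied to `/etc/cvmfs/default.local` as root and verified with the `cvmfs_config chksetup` and `cvmfs_config probe` commands shown above.
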
101 
102 > You might need special configurations for some custom software
103 > repositories! Special cases are not covered in this guide.
104 
105 ### Firewall configuration
106 
107 [PROOF on Demand](http://pod.gsi.de/) is very flexible in handling
108 various network topologies. The best solution is to allow
109 all TCP communication between the cluster machines.
110 
111 No other incoming communication is required from the outside.
112 
113 Configuration steps for the head node only
114 ------------------------------------------
115 
116 ### Setup HTTPS+SSH (sshcertauth) authentication
117 
118 > Latest recommended sshcertauth version is 0.8.5.
119 >
120 > [Download](https://github.com/dberzano/sshcertauth/archive/v0.8.5.zip)
121 > and [read the
122 > instructions](http://newton.ph.unito.it/~berzano/w/doku.php?id=proof:sshcertauth).
123 
124 If you want your users to connect to the PROOF cluster using their Grid
125 user certificate and private key you might be interested in installing
126 sshcertauth. Please refer to the [installation
127 guide](http://newton.ph.unito.it/~berzano/w/doku.php?id=proof:sshcertauth)
128 for further information.
129 
130 ### PROOF on Demand
131 
132 > Latest recommended PROOF on Demand version is 3.12.
133 >
134 > **On CernVM-FS:** `/cvmfs/sft.cern.ch/lcg/external/PoD/3.12`
135 >
136 > **Source code:** [PoD download page](http://pod.gsi.de/download.html)
137 > and [Installation
138 > instructions](http://pod.gsi.de/doc/3.12/Installation.html)
139 
140 [PROOF on Demand](http://pod.gsi.de/) is required on the head node and on the
141 user's client.
142 
143 In case your experiment provides a version of PoD on CernVM-FS you can use
144 that one. Experiment-independent versions are available from the PH-SFT
145 cvmfs repository.
146 
147 Download the source code and compile it using the installation
148 instructions only if you have a specific reason to use a custom-built
149 PoD version.
150 
151 Please note that [CMake](http://www.cmake.org/) and
152 [Boost](http://www.boost.org/) are required to build PoD.
153 
154 - After you have built PoD, install it with:
155 
156  make install
157 
158 - After installing PoD, run:
159 
160  pod-server getbins
161 
162  This has to be done only once. It downloads the binary packages that
163  will be dynamically transferred to the worker nodes as binary
164  payload, which saves us from installing PoD on each cluster node.
165 
166  It is important to do this step now: if PoD has been
167  installed in a directory where the user has no write privileges, as
168  in the case of system-wide installations, the user won't be able to
169  download those required packages into the PoD binary directory.
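
The write-permission concern in the last step can be checked before handing the installation over to users. This is a minimal sketch under stated assumptions: `check_pod_bin_dir` is a hypothetical helper (not part of PoD), and the directory to test must be supplied by the caller, since this guide does not name the exact path that `pod-server getbins` writes to.

```shell
#!/bin/sh
# Sketch only: check_pod_bin_dir is a hypothetical helper, not part of PoD.
# It reports whether the current user can write into the given directory,
# which is what downloading the binary payload requires.
check_pod_bin_dir() {
    pod_dir=$1
    if [ ! -d "$pod_dir" ]; then
        echo "missing: $pod_dir"
        return 2
    elif [ -w "$pod_dir" ]; then
        echo "writable: $pod_dir"
        return 0
    else
        echo "read-only: $pod_dir"
        return 1
    fi
}

# Demonstration on a scratch directory standing in for the PoD install prefix.
scratch_dir=$(mktemp -d)
check_pod_bin_dir "$scratch_dir"
```

If the check reports a read-only directory for ordinary users, that is the situation in which `pod-server getbins` should already have been run by whoever installed PoD.
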
170 
171 > There is no need to "configure" PoD for your specific cluster: it is
172 > just enough to install it on your head node.
173 >
174 > PoD does not have any system-wide persistent daemon running or any
175 > system-wide configuration to be performed. Also, no part of PoD will
176 > ever be run as root.
177 >
178 > Do not worry about environment or software configuration at this time:
179 > there is no system configuration for that. All the environment for
180 > your software dependencies will be set via proper scripts from the PoD
181 > client.
182 >
183 > PoD client configuration and usage are covered in the
184 > appropriate manual page.
185 
186 ### Firewall configuration
187 
188 The head node only requires **TCP ports 22 (SSH) and 443 (HTTPS)** to accept
189 connections from the outside. Users will get an authentication "token"
190 from port 443, and PoD will automatically tunnel all PROOF traffic through
191 an SSH connection on port 22.
192 
193 In case you are not using the HTTPS+SSH token authentication method, access to
194 port 22 alone is all you need.
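
The two cases above can be expressed as firewall rules. The sketch below assumes `iptables` is the firewall in use on the head node, which this guide does not state. `print_head_node_rules` is a hypothetical helper that only prints the commands, so nothing on the system is changed.

```shell
#!/bin/sh
# Sketch only: print_head_node_rules is a hypothetical helper. It prints
# iptables commands (assuming iptables is the firewall in use) without
# running them.
# Argument: "yes" if HTTPS+SSH (sshcertauth) authentication is used, else "no".
print_head_node_rules() {
    use_https_auth=${1:-yes}
    # Port 22 (SSH) is always needed: PoD tunnels PROOF traffic through it.
    echo "iptables -A INPUT -p tcp --dport 22 -j ACCEPT"
    if [ "$use_https_auth" = "yes" ]; then
        # Port 443 (HTTPS) is only needed to hand out authentication tokens.
        echo "iptables -A INPUT -p tcp --dport 443 -j ACCEPT"
    fi
}

print_head_node_rules yes
```

Review the printed rules against your site's existing firewall policy before applying any of them as root.
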