Using the Virtual Analysis Facility
===================================

Introduction
------------

The Virtual Analysis Facility can be used easily by installing the
following software on your client:

- [ROOT](http://root.cern.ch/)

- [PROOF on Demand](http://pod.gsi.de/)

- The VAF client *(see below)*: a convenience tool that sets up the
  environment for your experiment's software both on your client and
  on the PROOF worker nodes

> If you are an end user, you can probably skip the part about
> configuring the VAF client: your system administrator has most
> likely set it up for you already.

The Virtual Analysis Facility client
------------------------------------

The Virtual Analysis Facility client takes care of setting up, for the
end user, the environment required by the experiment's software. The
environment is set both on the client and on each PROOF node.

Technically it is a Bash shell script which provides shortcuts for PROOF
on Demand commands and ensures local and remote environment consistency:
by executing it you enter a new clean environment where all your
software dependencies have already been set up.

Local and remote environment configuration is split into a series of
files, which makes it possible to:

- have a system-wide, sysadmin-provided experiment configuration

- execute user actions either *before* or *after* the execution of the
  system-wide script (for instance, choosing the preferred version of
  the experiment's software)

- transfer a custom user **payload** to each PROOF worker (for instance,
  the user's client-generated Grid credentials, to make PROOF workers
  capable of accessing remote authenticated storage)

Configuration files are searched for in two different locations:

- a system-wide directory: `<client_install_dir>/etc`

- the user's home directory: `~/.vaf`

> A system-wide configuration file always takes precedence over the
> user's configuration. It is thus possible for the sysadmin to enforce
> a policy where some scripts can never be overridden.

Thanks to this separation, users can maintain very simple configuration
files that contain only what really needs to be (or is allowed to be)
customized: for instance, a user might specify a single line containing
the needed ROOT version, while all the technicalities of setting up the
environment are taken care of by system-installed scripts, leaving the
user's configuration directory clean and uncluttered.

### Local environment configuration

All the local environment files are loaded at the client's startup, in
the following order:

- `common.before`

- `local.before`

- `local.conf`

- `$VafConf_LocalPodLocation/PoD_env.sh`

- `common.after`

- `local.after`

The `common.*` files are sourced both for the local and the remote
environment. This is convenient to avoid repeating the same
configuration in different places.

Each file is looked for first in the system-wide directory and then in
the user's directory. If a configuration file does not exist, it is
silently skipped.

The `$VafConf_LocalPodLocation/PoD_env.sh` environment script, provided
with each PROOF on Demand installation, *must exist*: without this file,
the VAF client won't start.

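The lookup-and-precedence logic described above can be sketched in Bash (an illustration, not the actual client code; `VAF_SYSCONF_DIR` is a hypothetical stand-in for `<client_install_dir>/etc`):

``` {.bash}
# Sketch of the configuration file lookup: the system-wide copy, if
# present, takes precedence over the user's copy in ~/.vaf; a file
# existing in neither location is silently skipped.
load_config() {
  if [ -r "$VAF_SYSCONF_DIR/$1" ]; then
    . "$VAF_SYSCONF_DIR/$1"    # system-wide version wins
  elif [ -r "$HOME/.vaf/$1" ]; then
    . "$HOME/.vaf/$1"          # fall back to the user's version
  fi                           # neither exists: skip silently
}
```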
### List of VAF-specific variables

There are some special variables that need to be set in one of the above
configuration files.

`$VafConf_LocalPodLocation`
: Full path to the PoD installation on the client.

 > The `$VafConf_LocalPodLocation` variable must be set before the
 > `PoD_env.sh` script gets sourced, so set it in `common.before`,
 > `local.before` or `local.conf`. Since PoD is usually installed
 > system-wide, its location is normally set system-wide by the system
 > administrator, typically in the `local.conf` file.

`$VafConf_RemotePodLocation`
: Full path to the PoD installation on the VAF master node.

 *Note: this variable should be set in the configuration files for
 the local environment even though it refers to software present on
 the remote nodes.*

`$VafConf_PodRms` *(optional)*
: Name of the Resource Management System used for submitting PoD jobs.
 Run `pod-submit -l` to see the possible values.

 If not set, defaults to `condor`.

`$VafConf_PodQueue` *(optional)*
: Queue name where PoD jobs are submitted.

 If no queue is given, the default one configured on your RMS will be
 used.

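As an illustration, a minimal configuration setting these variables (all paths and values below are made up; adapt them to your installation) could look like this:

``` {.bash}
# Example local.conf -- values are illustrative only
export VafConf_LocalPodLocation='/opt/PoD/3.12'    # PoD path on the client
export VafConf_RemotePodLocation='/opt/PoD/3.12'   # PoD path on the VAF master
export VafConf_PodRms='condor'                     # same as the default
#export VafConf_PodQueue='long'                    # unset: RMS default queue
```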
### Remote environment configuration

All the PoD commands sent to the VAF master will live in the environment
loaded using the following scripts.

Similarly to the local environment, configuration is split into different
files to allow for a system-wide configuration, which takes precedence
over the user's configuration in the home directory. If a script cannot
be found, it is silently skipped.

- `<output_of_payload>`

- `common.before`

- `remote.before`

- `remote.conf`

- `common.after`

- `remote.after`

For an explanation of how to pass extra data to the workers safely
through the payload, see below.

### Payload: sending local files to the remote nodes

In many cases it is necessary to send some local data to the remote
workers: it is very common, for instance, to distribute a local Grid
authentication proxy to the remote workers to let them authenticate when
accessing a data storage.

The `payload` file must be an executable generating output that will be
prepended to the remote environment preparation. Unlike the other
environment scripts, it is not sourced directly: instead, it is first
run locally, and *the output it produces* is what gets executed
remotely.

Let's see a practical example to better understand how it works. We need
to send our Grid proxy to the master node.

This is our `payload` executable script:

``` {.bash}
#!/bin/bash
echo "echo '`cat /tmp/x509up_u$UID | base64 | tr -d '\r\n'`'" \
  "| base64 -d > /tmp/x509up_u\$UID"
```

This script will be executed locally, producing another "script line" as
output:

``` {.bash}
echo 'VGhpcyBpcyB0aGUgZmFrZSBjb250ZW50IG9mIG91ciBHcmlkIHByb3h5IGZpbGUuCg==' | base64 -d > /tmp/x509up_u$UID
```

This line will be prepended to the remote environment script and will be
executed before anything else on the remote node: it will effectively
decode the Base64 string back to the proxy file and write it into the
`/tmp` directory. Note also that the first `$UID` is not escaped and
will be substituted *locally* with your user ID *on your client
machine*, while the second one has the dollar escaped (`\$UID`) and will
be substituted *remotely* with your user ID *on the remote node*.

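The difference between the two expansions can be verified locally with a quick experiment (`X` here is just an arbitrary variable used for illustration):

``` {.bash}
# An unescaped $X expands immediately, in the shell generating the line;
# an escaped \$X survives as a literal "$X" and would only expand when
# the generated line is itself executed (i.e., on the remote node).
X='local-value'
echo "now=$X later=\$X"
# prints: now=local-value later=$X
```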
> It is worth noting that the remote environment scripts are sent to
> the remote node over a secure connection (SSH), so there is no
> concern about placing sensitive user data there.

Installing the Virtual Analysis Facility client
-----------------------------------------------

### Download the client from Git

The Virtual Analysis Facility client is available on
[GitHub](https://github.com/dberzano/virtual-analysis-facility):

``` {.bash}
git clone git://github.com/dberzano/virtual-analysis-facility.git /dest/dir
```

The client will be found in `/dest/dir/client/bin/vaf-enter`: it is
convenient to add it to the `$PATH` so that users can simply start it
by typing `vaf-enter`.

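Assuming the destination directory used above, this could be done by adding a line like the following to the users' shell startup file (e.g. `~/.bashrc`):

``` {.bash}
# Make vaf-enter reachable as a plain command
export PATH="/dest/dir/client/bin:$PATH"
```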
### Install the experiment's configuration files system-wide

A system administrator might find it convenient to install the
experiment environment scripts system-wide.

Configuration scripts for LHC experiments are shipped with the VAF
client and can be found in
`/dest/dir/client/config-samples/<experiment_name>`. To have the VAF
client use them by default, place them in the `/dest/dir/etc` directory
like this:

``` {.bash}
rsync -a /dest/dir/client/config-samples/<experiment_name>/ /dest/dir/etc/
```

Remember that the trailing slash in the source directory name is
meaningful to `rsync` and must not be omitted.

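The effect of the trailing slash can be checked with a throwaway directory (illustration only; file names are arbitrary):

``` {.bash}
# With the trailing slash, rsync copies the *contents* of src into dst
# (giving dst/common.before); without it, rsync would copy the directory
# itself, giving dst/src/common.before instead.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/dst"
touch "$tmp/src/common.before"
rsync -a "$tmp/src/" "$tmp/dst/"
ls "$tmp/dst"
# prints: common.before
```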
> Remember that system-wide configuration files will always take
> precedence over the user's configuration files, so *don't place files
> there that are supposed to be provided by the user!*

Entering the Virtual Analysis Facility environment
--------------------------------------------------

The Virtual Analysis Facility client is a wrapper around commands sent
to the remote host by means of PROOF on Demand's `pod-remote`. The VAF
client takes care of setting up passwordless SSH from your client node
to the VAF master.

### Getting the credentials

> You can skip this paragraph if the remote server wasn't configured for
> HTTPS+SSH authentication.

In our example we will assume that the remote server's name is
`cloud-gw-213.to.infn.it`: substitute it with your remote endpoint.

First, check that you have your Grid certificate and private key
installed both in your browser and in the `~/.globus` directory of your
client.

Point your browser to `https://cloud-gw-213.to.infn.it/auth/`: you'll
probably be asked to choose a certificate for authentication. Pick one
and you'll be presented with the following web page:

![Web authentication with sshcertauth](img/sshcertauth-web.png)

The web page clearly explains what to do next.

### Customizing user's configuration

Before entering the VAF environment, you should customize the user's
configuration. How to do so depends on your experiment, but usually you
essentially need to specify the version of the experiment's software you
want.

For instance, in the CMS use case, only one file is needed:
`~/.vaf/common.before`, which contains something like:

``` {.bash}
# Version of CMSSW (as reported by "scram list")
export VafCmsswVersion='CMSSW_5_3_9_sherpa2beta2'
```

### Entering the VAF environment

Open a terminal on your client machine (either your local computer or a
remote user interface) and type:

    vaf-enter <username>@cloud-gw-213.to.infn.it

Substitute `<username>` with the username provided either by your
system administrator or by the web authentication (if you used it).

You'll be presented with a neat shell which looks like the following:

    Entering VAF environment: dberzano@cloud-gw-213.to.infn.it
    Remember: you are still in a shell on your local computer!
    pod://dberzano@cloud-gw-213.to.infn.it [~] >

This shell runs on your local computer, with the environment properly
set up.

PoD and PROOF workflow
----------------------

> The following operations are valid inside the `vaf-enter` environment.

### Start your PoD server

With PROOF on Demand, each user controls their own personal PROOF
cluster. The first thing to do is to start the PoD server and the
PROOF master like this:

    vafctl --start

A successful output will be similar to:

    ** Starting remote PoD server on dberzano@cloud-gw-213.to.infn.it:/cvmfs/sft.cern.ch/lcg/external/PoD/3.12/x86_64-slc5-gcc41-python24-boost1.53
    ** Server is started. Use "pod-info -sd" to check the status of the server.

### Request and wait for workers

Now the server is started but you don't have any workers available. To
request `<n>` workers, do:

    vafreq <n>

To check how many workers became available for use:

    pod-info -n

To continuously update the check (`Ctrl-C` to terminate):

    vafcount

Example of output:

    Updating every 5 seconds. Press Ctrl-C to stop monitoring...
    [20130411-172235] 0
    [20130411-172240] 0
    [20130411-172245] 12
    [20130411-172250] 12
    ...

To execute a command after a certain number of workers is available (in
the example we wait for 5 workers, then start ROOT):

    vafwait 5 && root -l

> Workers take some time before becoming available. Also, it is possible
> that the request will not be fully satisfied.

### Start ROOT and use PROOF

When you are satisfied with the number of active workers available, you
may start your PROOF analysis. Start ROOT and, from its prompt, connect
to PROOF like this:

    root [0] TProof::Open("pod://");

Example of output:

    Starting master: opening connection ...
    Starting master: OK
    Opening connections to workers: OK (12 workers)
    Setting up worker servers: OK (12 workers)
    PROOF set to parallel mode (12 workers)

### Stop or restart your PoD cluster

At the end of your session, remember to free the workers by stopping
your PoD server:

    vafctl --stop

> PoD will in any case stop the PROOF master and the workers after
> detecting that they have been idle for a certain amount of time, but
> it is a good habit to stop them yourself when you're finished, so
> that resources are freed immediately and made available to other
> users.

In case of a major PROOF failure (i.e., a crash), you can simply restart
your personal PROOF cluster by running:

    vafctl --start

PoD will stop and restart the PROOF master. You'll need to request the
workers again at this point.
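
Putting the whole workflow together, a typical interactive session
inside `vaf-enter` might look like the following (the worker count and
the exact sequence are examples only):

    vafctl --start          # start your personal PoD server
    vafreq 12               # request 12 workers
    vafwait 12 && root -l   # wait for the workers, then start ROOT
    # ...from the ROOT prompt: TProof::Open("pod://"); run your analysis...
    vafctl --stop           # free the workers when done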