This chapter covers installing Youpi in a GNU/Linux environment. It should also work on most UNIX-like systems that support Python, which covers quite a lot of platforms. For example, Youpi is currently hosted on a FreeBSD server at TERAPIX.
For the moment, two configurations can be considered while setting up a Youpi environment:
You can perform a standalone installation in order to install everything, i.e. Youpi and all its dependencies, on the same Linux host. This way it is possible to reduce data with tools installed locally (while accessing your images remotely, for example over NFS), which can be useful if you just want to give Youpi a try or if you don’t plan to set up a more complex installation involving several cluster nodes for processing your data.
Note
We provide a standalone installation on a Live DVD, available from Youpi’s website. This is the quickest and easiest way to give Youpi a try. All software packages are installed and already configured for immediate use. Everything is loaded at runtime into your computer’s memory, so be sure to have a rather powerful machine (with tons of memory!).
Most often, you will want to perform a network or cluster installation so that you can process your images on a computer cluster. The cluster installation steps are lighter than those required for a fully operational standalone installation because there are far fewer software requirements. Most of the packages only have to be installed on the cluster nodes, while the computer running Youpi only needs Condor and some Python-related software packages.
Note
In any case, you always need to have a running Condor installation on the host running Youpi. Indeed, Youpi needs to be able to submit jobs on the cluster (the condor_submit command is used internally to submit jobs).
The latest version of Youpi is available in the download section of Youpi’s website.
As of today, running Youpi involves installing many software packages. Optional packages will enhance Youpi’s functionalities and will improve the final user experience. As explained at the beginning of this chapter, the list of packages you need to install depends on the type of installation - standalone or cluster - you want to perform.
Here is a partial dependency tree to better visualize Youpi’s software dependencies. All those packages are required and must be installed on a computer host for a standalone installation:
Warning
Since Youpi mainly depends on Python, we highly recommend installing Python first. This way, you can specify (force) which version of Python you want to install so that all required packages using it will depend on a Python version supported by Youpi.
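As a quick sanity check before installing the remaining packages, the supported range listed in the Software packages table (2.5.x or 2.6.x only) can be expressed as a small test. This is only an illustrative sketch: the name `is_supported` is hypothetical and not part of Youpi.

```python
import sys

# Major/minor versions listed as supported in the Software packages table.
SUPPORTED_VERSIONS = ((2, 5), (2, 6))


def is_supported(version_info=None):
    """Return True if the given (major, minor, ...) tuple is 2.5.x or 2.6.x."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


# Report on the interpreter running this script.
print("Python %d.%d supported by Youpi: %s"
      % (sys.version_info[0], sys.version_info[1], is_supported()))
```

Run it with the interpreter you intend to use for Youpi, for example `python2.5 check_version.py` (the file name is arbitrary).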
As specified in the Software packages section, here are the packages you must install:
The cluster installation process is slightly lighter than the standalone one. Only the following software packages are needed on the computer hosting Youpi:
As specified in the Software packages section, here are the packages you must install:
Here is a list of all packages involved in a Youpi installation. Some of them may not be required if you perform a cluster installation:
| Software package | Supported version | Description |
|---|---|---|
| Firefox | 2.0 and later | We fully support this web browser at the moment. Safari is reported to be working and IE is not supported. |
| Python | 2.5.x or 2.6.x only | General-purpose high-level programming language |
| Apache | 2.x | Apache HTTP server project |
| Apache mod_wsgi | 2.3 and later | Apache module that can host, within the server, any Python application supporting the Python WSGI interface |
| Fitsverify | all | Program that rigorously checks whether a FITS format data file conforms to all the requirements defined in the FITS (Flexible Image Transport System) Standard document |
| MySQL | 5.x and later | Relational database management system (RDBMS) |
| Django | 1.1.x | Python web framework |
| Condor | 7.2.x and later | High Throughput Computing (HTC) on large collections of distributively owned computing resources |
| Python cjson | 1.0.5 | Implements a very fast JSON encoder/decoder for Python. |
| PyFITS | 1.3 and later | Provides an interface to FITS formatted files under the Python scripting language |
| Python matplotlib | 0.98 and later | Python 2D plotting library |
| Python magic | 5.03 and later | Python module for determining file type |
| MySQL Python | 1.2.x and later | Python interface to MySQL |
| Geos | 3.0.x and later | Library for performing geometric operations used by GeoDjango |
| NumPy | 1.3.x and later | Fundamental package needed for scientific computing with Python |
| ImageMagick | 6.x and later | Software suite to create, edit, and compose bitmap images |
| CFITSIO | 3.x and later | Youpi uses the imcopy tool internally (part of the FITS utility programs bundled with CFITSIO) |
| Condor transfer | Any version distributed with Youpi | Youpi comes with a condor_transfer.pl script which is used to optimize data transfer between hosts on a cluster. This Perl script uses the libparallel-forkmanager-perl package. You have to set the CMD_CONDOR_TRANSFER variable in your settings.py file. |
| cURL | 7.x and later | Command line tool for transferring files with URL syntax, supporting various protocols. |
| QualityFITS | 1.13.12 and later | Open source Quality Assessment software used at Terapix |
| WCS library | 4.2 and later | Library for the FITS “World Coordinate System” (WCS) |
| PSFEx | 2.4.2 and later | PSFEx stands for “PSF Extractor”: software that makes PSF models for use with the SExtractor program |
| WeightWatcher | 1.8.8 and later | Program that combines weight-maps, flag-maps and polygon data in order to produce control maps which can directly be used in astronomical image-processing packages like Drizzle, Swarp or SExtractor |
| Scamp | 1.4.x and later | Reads SExtractor catalogs and computes astrometric and photometric solutions for any arbitrary sequence of FITS images in a completely automatic way |
| Swarp | 2.17.x and later | Resamples and co-adds together FITS images using any arbitrary astrometric projection |
| Sextractor | 2.8.6 and later | Builds a catalogue of objects from an astronomical image |
| Stiff | 1.12 and later | Converts scientific FITS images to the more popular TIFF format for illustration purposes |
This section will guide you through the Youpi installation process.
As an example, here is the command one would issue to install most components on an Ubuntu GNU/Linux distribution. For a standalone installation, you should issue the following command:
$ sudo apt-get install python2.5 apache2 libapache2-mod-python mysql-server \
> mysql-client python-django python-cjson python-matplotlib python-magick \
> python-mysqldb python-numpy imagemagick libparallel-forkmanager-perl curl
While for a cluster installation, you might want to try:
$ sudo apt-get install python2.5 apache2 libapache2-mod-python mysql-client \
> python-django python-cjson python-matplotlib python-magick python-mysqldb \
> python-numpy
As you can see, not all dependencies are available through Ubuntu’s Debian-like dpkg facility because they are not packaged for that Linux distribution. Thus, it’s up to you to install the remaining software, either from source (for CFITSIO) or from the rpm binary packages available for Condor, QualityFITS, Scamp, Swarp, Sextractor and Stiff, all available for download from their respective websites (see the Software Packages section).
Note
Users of Debian-like distributions (such as Ubuntu) can use tools like alien to convert rpm binary packages to deb binary archives that can later be installed with the dpkg tool.
Once you get the source code of Youpi, uncompress the tarball in the directory of your choice, which may be your home directory. Let’s use your $HOME environment variable as the base directory for installation:
$ cd
$ tar xvzf youpi-x.x.tar.gz
Warning
For security reasons, decompressing the tarball in a directory under your Apache’s DocumentRoot is not recommended. Further information is available in the official Django documentation.
A youpi directory has just been created in your home directory. Let’s set $YOUPI_INSTALL_DIR to point to your newly created youpi directory. We will use this variable later to refer to the base directory of your Youpi installation:
$ export YOUPI_INSTALL_DIR=$HOME/youpi
Now that everything is installed, a little more work is needed to get things up and running. You are advised to follow the configuration order suggested in the following sections since they depend on each other.
First, you have to configure your MySQL server to allow using a dedicated database for Youpi. Connect to the server with a privileged user account and issue the following commands:
mysql> grant all privileges on youpi.* to youpiuser@'%' identified by 'secret';
mysql> flush privileges;
mysql> create database youpi;
All access privileges are granted to the youpi database and you can access it with the youpiuser login and secret password. The youpiuser@'%' part means you can access your database from any host, which may not be what you want. To improve security, you may consider specifying a more restrictive rule. You might want to check the MySQL Reference Manual for further details.
You may want to check if you can connect to your newly created database from MySQL command line client:
$ mysql -u youpiuser -p youpi
Enter password: ****
mysql>
Your Condor installation and configuration depend on the kind of Youpi installation you chose. Configuring Condor itself is outside the scope of this document, but you can browse the official Condor documentation instead.
The most important requirement for Youpi to work properly is to be able to submit jobs from the machine hosting the Youpi installation. The condor_q command must be in your $PATH.
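To verify this requirement, a small script can search the directories listed in $PATH for the Condor commands named in this chapter (condor_submit and condor_q). This is a minimal sketch, not a Youpi tool: the helper name `find_on_path` is hypothetical.

```python
import os


def find_on_path(command, path=None):
    """Return the full path of an executable found in PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Condor commands mentioned in this chapter.
for command in ("condor_submit", "condor_q"):
    print("%s -> %s" % (command, find_on_path(command) or "NOT FOUND"))
```

Run it as the user that will run Youpi (typically your web server's user), since that account's $PATH is the one that matters.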
Now that your database is properly configured, you have to set up Youpi’s Django installation. All the configuration options are available in the settings.py-dist file distributed with Youpi. Most parts of this file will remain untouched but some variables have to be set properly in order to match your installation environment:
$ cd $YOUPI_INSTALL_DIR/terapix
$ cp settings.py-dist settings.py
You now have to tailor the contents of your settings.py with appropriate values for all lines marked with a #FIXME comment. Here is the list of important variables you need to check:
Any options you might want to set on the CMD_CONDOR_TRANSFER command line. For more information, see the man page:
$ $YOUPI_INSTALL_DIR/tools/condor_transfer.pl --man
URL for FTP transfer. To maximize performance during network transfers, all results data are transferred back to the PROCESSING_OUTPUT directory using the FTP protocol with the condor_transfer.pl tool distributed with Youpi. FTP_URL should only contain the name of the target host (running a FTP server) that will receive the processing output data. For example, if your PROCESSING_OUTPUT directory is located on your data.example.org host, just set your FTP_URL this way:
FTP_URL = 'ftp://data.example.org/'
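Since FTP_URL should only name the target host, a value can be checked before it goes into settings.py. The following is a hedged sketch written for a recent Python 3 (to be run separately from Youpi itself); `is_host_only_ftp_url` is a hypothetical helper, not something Youpi provides.

```python
from urllib.parse import urlparse


def is_host_only_ftp_url(url):
    """Return True for URLs like 'ftp://host/' with no path, query or fragment."""
    parts = urlparse(url)
    return (
        parts.scheme == "ftp"
        and bool(parts.hostname)
        and parts.path in ("", "/")
        and not parts.query
        and not parts.fragment
    )


# The example value from this section passes; a URL carrying a path does not.
for candidate in ("ftp://data.example.org/", "ftp://data.example.org/var/youpi/"):
    print("%s -> %s" % (candidate, is_host_only_ftp_url(candidate)))
```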
Must point to the web server responsible for serving your HTML results over HTTP. For example, all QualityFITS results will belong to a PROCESSING_OUTPUT/USER/fitsin/OUTPUT_DIRECTORY/IMAGE_NAME/qualityFITS/ directory. PROCESSING_OUTPUT is the settings.py parameter you just configured while OUTPUT_DIRECTORY and IMAGE_NAME are variables used by the QualityFITS plugin. So, in order to serve those results and make them available from Youpi in your web browser, you will have to configure your web server so that it serves content from the PROCESSING_OUTPUT directory. Let’s say you want to serve all Youpi results stored in /var/youpi/results/ (your PROCESSING_OUTPUT directory) from your data.example.org host with your Apache web server listening on port 9000. Then you might want to add this entry in your Apache configuration file:
Listen 9000
<VirtualHost *:9000>
DocumentRoot "/var/youpi/results/"
<Directory "/var/youpi/results/">
Options Indexes MultiViews FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
</VirtualHost>
The same applies for WWW_FITSIN_SCAMP, WWW_FITSIN_SEX and WWW_FITSIN_SWARP. As written in the default settings.py-dist file, they have the same value as WWW_FITSIN_PREFIX, which is the default behaviour since they share the same PROCESSING_OUTPUT directory.
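To make the relation between files on disk and served URLs concrete, here is a small sketch that turns a path under PROCESSING_OUTPUT into the URL Apache would serve it from with the virtual host shown above. The base URL http://data.example.org:9000/ and the sample result path are assumptions built from the example values in this section, and `result_url` is a hypothetical helper, not a Youpi function.

```python
import posixpath

# Example values from this section; adjust to your own installation.
PROCESSING_OUTPUT = "/var/youpi/results/"
BASE_URL = "http://data.example.org:9000/"  # assumed: host and port of the virtual host


def result_url(file_path, root=PROCESSING_OUTPUT, base_url=BASE_URL):
    """Map a file stored under the results directory to its HTTP URL."""
    root = posixpath.normpath(root)
    target = posixpath.normpath(file_path)
    if not target.startswith(root + "/"):
        raise ValueError("%s is not under %s" % (file_path, root))
    relative = posixpath.relpath(target, root)
    return base_url.rstrip("/") + "/" + relative


# Hypothetical user, output directory and image names.
print(result_url("/var/youpi/results/user/fitsin/out/image/qualityFITS/index.html"))
```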
Whenever INGESTION_HOST_PATTERN matches the path to your images directory, a real hostname will be searched for in the INGESTION_HOST_MAPPING dictionary. If the value matched by the pattern corresponds to a key in this dictionary, the associated value will be used as a Condor target hostname requirement. For example, if your path to input images is something like /path/to/root/directory/host5/testing/megacam/, then you could set the following variables in your local_conf.py:
INGESTION_HOST_PATTERN = r'^/(.*?)/.*$'
INGESTION_HOST_MAPPING = {'host5': 'host5.mydomain.org'}
# Keys can also be declared as regexps
#INGESTION_HOST_MAPPING = {r'host.*': '\1.mydomain.org'}
Since host5 will match, the corresponding value host5.mydomain.org will be used. Keys can also be regular expressions. In that case, \1 stands for the hostname value that was matched.
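The lookup can be pictured with plain Python regular expressions. The sketch below is an assumption about the general idea only, not Youpi's actual implementation: `condor_target_host` is a hypothetical name, the \1 substitution for regexp keys is not covered, and the code targets a recent Python 3. Note that with this particular pattern the captured group is the first component of the path, so the sample path used here starts with /host5/.

```python
import re

INGESTION_HOST_PATTERN = r'^/(.*?)/.*$'
INGESTION_HOST_MAPPING = {'host5': 'host5.mydomain.org'}


def condor_target_host(path,
                       pattern=INGESTION_HOST_PATTERN,
                       mapping=INGESTION_HOST_MAPPING):
    """Return the mapped hostname for an image path, or None if nothing matches."""
    match = re.match(pattern, path)
    if match is None:
        return None
    captured = match.group(1)  # value extracted from the path by the pattern
    for key, hostname in mapping.items():
        # Keys are treated as regular expressions that must match the whole value.
        if re.fullmatch(key, captured):
            return hostname
    return None


print(condor_target_host('/host5/testing/megacam/'))
```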
This is the path of your Youpi installation, put the value of your $YOUPI_INSTALL_DIR here:
TRUNK = '/home/user/youpi'
New in version 0.6.1.
The command line to use Yahoo’s YUI Java compressor to minify static CSS and Javascript files. Please note that this parameter is not required but is highly recommended if you want to increase the application’s performance in production.
That’s it. You are done with the basic configuration variables. The remainder of the settings.py file can be left untouched.
Instead of heavily modifying the settings.py file directly, it is also possible to override its variables using a separate local_conf.py file, which will be imported automatically by Django when processing settings.py. This is a good place to set your database configuration settings, for example. A typical local_conf.py file may look like this:
CMD_CONDOR_TRANSFER = '/usr/local/bin/condor_transfer.pl'
CMD_FITSVERIFY = 'fitsverify'
# Testing database (do not use for production)
DATABASE_NAME = 'youpitest'
DATABASE_USER = 'debug'
DATABASE_PASSWORD = 'debug'
DATABASE_HOST = 'db.example.org'
#
FILE_BROWSER_HEADER_TITLE = 'Cluster Path Browser'
FILE_BROWSER_ROOT_TITLE = 'Network Filesystem'
FILE_BROWSER_ROOT_DATA_PATH = '/mnt/nfs'
# Email sent when a job fails
CONDOR_NOTIFY_USER = 'monnerville@iap.fr'
INGESTION_HOST_PATTERN = r'^/(.*?)/.*$'
INGESTION_HOST_MAPPING = {r'.*ix.*': '\1.domain.org'}
Warning
Some configuration variables must not be overridden in a local_conf.py file since other variables defined in settings.py may depend on their definition. All variables can be defined in local_conf.py except for TRUNK, CMD_CONVERT, CONVERT_THUMB_OPT, FTP_URL, PROCESSING_OUTPUT and WWW_FITSIN_PREFIX.
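A local_conf.py file can be checked against this list without importing it. The following sketch, written for a recent Python 3, parses the file's source and reports top-level assignments to any of the six variables named above; `protected_overrides` is a hypothetical helper and not part of Youpi.

```python
import ast

# Variables that must stay in settings.py, as listed in the warning above.
PROTECTED = frozenset([
    "TRUNK", "CMD_CONVERT", "CONVERT_THUMB_OPT",
    "FTP_URL", "PROCESSING_OUTPUT", "WWW_FITSIN_PREFIX",
])


def protected_overrides(source):
    """Return the sorted protected names assigned at the top level of source."""
    found = set()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id in PROTECTED:
                    found.add(target.id)
    return sorted(found)


sample = "DATABASE_NAME = 'youpitest'\nTRUNK = '/home/user/youpi'\n"
print(protected_overrides(sample))
```

To check a real file, pass its contents, for example `protected_overrides(open("local_conf.py").read())`.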
Now, give ownership to your web server’s user:
$ chown -R www-data:www-data $YOUPI_INSTALL_DIR
In your case, this may not be www-data but apache or www, depending on your Linux distribution. Just look at the entries in your /etc/passwd and /etc/group system files.
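One way to find out which of these accounts exists on your system is to query the password database from Python. This is a minimal sketch assuming a Unix host; the candidate list simply repeats the account names mentioned above, and `existing_accounts` is a hypothetical helper.

```python
import pwd

# Account names mentioned above; your distribution may use yet another one.
CANDIDATES = ("www-data", "apache", "www")


def existing_accounts(candidates=CANDIDATES):
    """Return the candidate user names that exist on this system."""
    found = []
    for name in candidates:
        try:
            pwd.getpwnam(name)
        except KeyError:
            continue  # no such account on this host
        found.append(name)
    return found


print("Web server accounts found: %s" % ", ".join(existing_accounts()))
```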
Now it’s time to run first-time installation checks involving creating the database structure, then populating it with initial data and creating a database admin account. Most of Youpi management is done from the $HOME/youpi/terapix directory using the manage.py command line script.
In order to create the database structure, ask the Django manager to issue the syncdb command:
$ cd $YOUPI_INSTALL_DIR/terapix
$ python manage.py syncdb
This will create all the database tables required by Youpi. It will also ask for a login, a valid email address and a password in order to create an active admin account. Only active accounts are allowed to authenticate and enter the Youpi application. Populating the database with initial data and running some safety checks involves the checksetup command:
$ python manage.py checksetup
Note
This command is non-destructive: you can use it whenever you want. It will only check that everything in the database is set up properly and will not alter or damage your data in any way.
Deploying Youpi into production using Apache is straightforward since Youpi comes with tools that generate the required configuration files. From the $YOUPI_INSTALL_DIR/terapix directory, just issue the following command:
$ python manage.py checksetup --wsgi --apache
The --wsgi option will create a $YOUPI_INSTALL_DIR/deploy/django.wsgi configuration file ready to use - with values matching your current Youpi installation - with any web server supporting the WSGI (Web Server Gateway Interface) specification.
The --apache option will generate a $YOUPI_INSTALL_DIR/deploy/youpi.conf configuration file ready to use with your Apache web server. You can either copy this file to your web server’s configuration files directory or copy its contents to your httpd.conf file.
Now just reload Apache before testing your installation:
$ sudo /etc/init.d/apache2 reload
Congratulations! The configuration step is over :)
At this point, you are ready to run Youpi. Just point your browser to the web server running Youpi, say http://myhost/youpi/. The login page should appear. If not, you may start checking your server’s logs and reread all the configuration steps in this chapter.
New in version 0.6.1.
This version comes with new features regarding performance improvement when Youpi is deployed on a production server (See the ChangeLog). Before that, a lot of HTTP requests were made every time a page was loaded. Many pages required that the browser issued more than 30(!) HTTP requests in order to download all the required content to render the page!
Even if browser caching is enabled (default), those 30+ HTTP queries will continue to be sent to the server(!) on every page reload. The browser will then receive a “304 Not Modified” HTTP response for all cached content.
Big performance improvements can be made with the following techniques:
Make fewer HTTP requests
It’s possible by combining all Javascript scripts used on the page into a single JS file. The same applies for CSS stylesheets. Moreover, with minified versions, page loading can be very fast. Youpi includes tools to generate static files combination and minification in a fully automatic way.
In order to generate those files, make sure you have set the COMPRESS_YUI_BINARY variable correctly (as explained in the Release Notes) in your settings.py (or local_conf.py) file and that the COMPRESS variable is undefined or set to True. If everything looks good, you can generate the files from the command line:
$ python manage.py synccompress --force
Those newly generated static files will be used automatically by Django when rendering templates.
Add Expires headers
A first-time visit to a page may require several HTTP requests to load all the components. By using Expires headers these components become cacheable, which avoids unnecessary HTTP requests on subsequent page views. Assuming that Apache’s mod_expires module is available (which is generally the case) and that the directory holding Youpi’s static files is /var/www/youpi/terapix/media, you can add the following directives to your Apache configuration file:
<Directory "/var/www/youpi/terapix/media">
# "Far future expire" technique
ExpiresActive on
ExpiresDefault "access plus 1 week"
</Directory>
Note
You may want to set up a different ExpiresDefault value, but one week before cache expiration should be just fine.
Compress static components with gzip
Compression reduces response times by reducing the size of the HTTP response. Apache’s mod_deflate compression generally reduces the response size by about 70%! Assuming that Apache’s mod_deflate module is available (which is generally the case) and that the directory holding Youpi’s static files is /var/www/youpi/terapix/media, you can add the following directives to your Apache configuration file:
<Directory "/var/www/youpi/terapix/media">
# Enables on-the-fly compression
SetOutputFilter DEFLATE
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png|swf|pdf|t?gz|zip|bz2|rar)$ no-gzip dont-vary
</Directory>
Note
Note that Django templates (dynamic content) are already gzip’ed in Youpi using the django.middleware.gzip.GZipMiddleware middleware.
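To get a feel for how much gzip can shrink text content, the ratio can be measured with Python's standard gzip module. This is an illustrative sketch for a recent Python 3 only: the sample below is artificial and highly repetitive, so it compresses far better than real Javascript or CSS files would, and `gzip_ratio` is a hypothetical helper.

```python
import gzip


def gzip_ratio(data):
    """Return compressed size divided by original size for a bytes object."""
    return len(gzip.compress(data)) / len(data)


# Artificial, repetitive sample standing in for a static text file.
sample = b"function add(a, b) { return a + b; }\n" * 200
print("compressed size is %.1f%% of the original" % (100 * gzip_ratio(sample)))
```

To measure one of your own static files, read it in binary mode and pass its contents to `gzip_ratio`.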