2.1. Installing Youpi

This chapter is about installing Youpi in a GNU/Linux environment. Actually, it should work on most UNIX-like systems that support Python (quite a lot). For example, Youpi is currently hosted on a FreeBSD server at TERAPIX.

For the moment, two configurations can be considered while setting up a Youpi environment:

  • you can decide to perform a standalone installation in order to install everything, i.e. Youpi and all its dependencies, on the same Linux host. This way it is possible to reduce data with tools installed locally (while accessing your images remotely, for example over NFS), which can be useful if you just want to give Youpi a try or if you don’t plan to setup a more complex installation involving several cluster nodes for processing your data.

    Note

    We provide a standalone installation on a Live DVD , available from Youpi’s website. This is the quickest and easiest way to give Youpi a try. All software packages are installed and are already configured for immediate use. Everything is loaded at runtime into your computer’s memory, so be sure to have a rather powerful machine (with tons of memory!).

  • Most often, you will want to perform a network or cluster installation if you want to be able to process your images on a computer cluster. The cluster installation steps are lighter than those required for a fully operational standalone installation because there are much less software requirements. Most of the packages have only to be installated on the cluster nodes, while the computer running Youpi only needs Condor and some Python-related software packages.

    Note

    In any case, you always need to have a running Condor installation on the host running Youpi. Indeed, Youpi needs to be able to submit jobs on the cluster (the condor_submit command is used internally to submit jobs).

2.1.1. Getting Youpi

The latest version of Youpi is available in the download section of Youpi’s website.

2.1.2. Software Dependencies

As of today, running Youpi involves installing many software packages. Optional packages will enhance Youpi’s functionalities and will improve the final user experience. As explained at the beginning of this chapter, the list of packages you need to install depends on the type of installation - standalone or cluster - you want to perform.

2.1.2.1. Standalone Installation

Here is a partial dependency tree to better visualize Youpi’s software dependencies. All those packages are required and must be installed on a computer host for a standalone installation:

Warning

Since Youpi mainly depends on Python, we highly recommand to install Python first. This way, you can specify (force) which version of Python you want to install so that all required packages using it will depend on a Python version supported by Youpi.

As specified in the Software packages section, here are the packages you must install:

  • Apache
  • Apache mod_wsgi
  • CFITSIO
  • Condor
  • Condor transfer script
  • cURL
  • Django
  • Fitsverify
  • Firefox
  • Geos
  • ImageMagick
  • MySQL client and server
  • Mysql-Python
  • NumPy
  • PSFEx
  • Python
  • Python cjson
  • Python matplotlib
  • Python magic
  • QualityFITS
  • Scamp
  • Sextractor
  • Stiff
  • Swarp
  • WCS library
  • WeightWatcher

2.1.2.2. Cluster Installation

The cluster installation process is slightly lighter than the standalone one. Only the following software packages are needed on the computer hosting Youpi:

As specified in the Software packages section, here are the packages you must install:

  • Apache
  • Apache mod_wsgi
  • Condor
  • Django
  • Firefox
  • Geos
  • MySQL client only
  • Mysql-Python
  • NumPy
  • Python
  • Python cjson
  • Python matplotlib
  • Python magic

2.1.2.3. Software packages

Here is a list of all packages involved in a Youpi installation. Some of them may not be required if you perform a cluster installation:

Software package Supported version Description
Firefox 2.0 and later We fully support this web browser at the moment. Safari is reported to be working and IE is not supported.
Python 2.5.x or 2.6.x only General-purpose high-level programming language
Apache 2.x Apache HTTP server project
Apache mod_wsgi 2.3 and later Apache module which can host any Python application which supports the Python WSGI interface interpreter within the server
Fitsverify all Program that rigorously checks whether a FITS format data file conforms to all the requirements defined in the FITS (Flexible Image Transport System) Standard document
MySQL 5.x and later Relational database management system (RDBMS)
Django 1.1.x Python web framework
Condor 7.2.x and later High Throughput Computing (HTC) on large collections of distributively owned computing resources
Python cjson 1.0.5 Implements a very fast JSON encoder/decoder for Python.
PyFITS 1.3 and later Provides an interface to FITS formatted files under the Python scripting language
Python matplotlib 0.98 and later Python 2D plotting library
Python magic 5.03 and later Python module for determing file type
MySQL Python 1.2.x and later Python interface to MySQL
Geos 3.0.x and later Library for performing geometric operations used by GeoDjango
NumPy 1.3.x and later Fundamental package needed for scientific computing with Python
ImageMagick 6.x and later Software suite to create, edit, and compose bitmap images
CFITSIO 3.x and later Youpi uses the imcopy tool internally (part of the FITS utility programs bundled with CFITSIO)
Condor tranfer Any version distributed with Youpi Youpi comes with a condor_transfer.pl script which is used to optimize data transfer between hosts on a cluster. This perl script uses the libparallel-forkmanager-perl package. You have to set the CMD_CONDOR_TRANSFER variable in your settings.py file.
cURL 7.x and later Command line tool for transferring files with URL syntax, supporting various protocols.
QualityFITS 1.13.12 and later Open source Quality Assessment software used at Terapix
WCS library 4.2 and later Library for the FITS “World Coordinate System” (WCS)
PSFEx 2.4.2 and later PSFEx stands for “PSF Extractor”: a software that makes PSF models for use with the SExtractor program
WeightWatcher 1.8.8 and later program that combines weight-maps, flag-maps and polygon data in order to produce control maps which can directly be used in astronomical image-processing packages like Drizzle, Swarp or SExtractor
Scamp 1.4.x and later Reads SExtractor catalogs and computes astrometric and photometric solutions for any arbitrary sequence of FITS images in a completely automatic way
Swarp 2.17.x and later Resamples and co-adds together FITS images using any arbitrary astrometric projection
Sextractor 2.8.6 and later Builds a catalogue of objects from an astronomical image
Stiff 1.12 and later Converts scientific FITS images to the more popular TIFF format for illustration purposes

2.1.3. Installation

This section will guide through the Youpi installation process.

2.1.3.1. Installation on Ubuntu Linux

As an example here is the command one would issue to install most components using a Ubuntu GNU/Linux distribution. For a standalone installation, you should issue the following command:

$ sudo apt-get install python2.5 apache2 libapache2-mod-python mysql-server \
> mysql-client python-django python-cjson python-matplotlib python-magick \
> python-mysqldb python-numpy imagemagick libparallel-forkmanager-perl curl

While for a cluster installation, you might want to try:

$ sudo apt-get install python2.5 apache2 libapache2-mod-python mysql-client \
> python-django python-cjson python-matplotlib python-magick python-mysqldb \
> python-numpy

As you can see, not all dependencies are available through the Ubuntu’s debian-like dpkg facility because they are not packaged for that Linux distribution. Thus, it’s up to you to install the remaining sofware, either from the source (for CFITSIO) or from the rpm binary packages available for Condor, QualityFITS, Scamp, Swarp, Sextractor and Stiff, all available for download from their respective websites (see the Software Packages section).

Note

Debian-like users (such as Ubuntu ones) can use tools like alien to convert rpm binary packages to deb binary archives that you can later install with the dpkg tool.

2.1.3.2. Installing Youpi

Once you get the source code of Youpi, uncompress the tarball in the directory of your choice, which may be your home directory. Let’s use your $HOME environement variable as the base directory for installation:

$ cd
$ tar xvjf youpi-x.x.tar.gz

Warning

For security reasons, decompressing the tarball in a directory under your Apache’s DocumentRoot is not recommanded. Further information if available in the official Django’s documentation.

A youpi directory has just been created in your home dir. Let’s set $YOUPI_INSTALL_DIR to point to your newly created youpi directory. We will use this variable later to refer to your Youpi’s installation base directory:

$ export YOUPI_INSTALL_DIR=$HOME/youpi

2.1.4. Configuration

Now that everything is installed, a little more work is needed to get the things up and running. You are advise to follow the configuration order suggested in the following sections since they depend on each other.

2.1.4.1. MySQL

First, you have to configure your MySQL server to allow using a dedicated database for Youpi. Connect to the server with a privileged user account and issue the following commands:

mysql> grant all privileges on youpi.* to youpiuser@'%' identified by 'secret';
mysql> flush privileges;
mysql> create database youpi;

All access privileges are granted to the youpi database and you can access it with the youpiuser login and secret password. The youpiuser@'%' part means you can access your database from any host, which may not be what you want. To improve security, you may consider specifying a more restrictive rule. You might want to check the MySQL Reference Manual for further details.

You may want to check if you can connect to your newly created database from MySQL command line client:

$ mysql -u youpiuser -p youpi
Enter password: ****
mysql>

2.1.4.2. Condor

Your Condor installation and configuration depends on your Youpi’s installation kind. Configuring Condor itself is outside the scope of this document, but you can browse the official Condor documentation instead.

The most important requirement for Youpi to work properly is to be able to submit jobs from the machine hosting the Youpi installation. The condor_q command must be in your $PATH.

2.1.4.3. Django Configuration

Now that your database is properly configured, you have to setup Youpi’s Django installation. All the configuration options are available in the settings.py-dist file distributed with Youpi. Most parts of this file will remain untouched but some variables have to be setup properly in order to match your installation environment:

$ cd $YOUPI_INSTALL_DIR/terapix
$ cp settings.py-dist settings.py

You now have to tailor your settings.py‘s contents with appropriate values for all lines marked with a #FIXME comment. Here is the list of important variables you need to check:

HAS_CONVERT
Defaults to True. Youpi will try to generate thumbnails with the convert tool available in the ImageMagick software package. You can set it to False if you don’t plan to use this feature.
CMD_CONVERT
If HAS_CONVERT is set to True, fill in the full path to the convert program.
CMD_STIFF
Fill in the full path to the stiff.
CMD_IMCOPY
Fill in the full path to the imcopy program (parts of the CFITSIO package).
CMD_CONDOR_TRANSFER
Path to the provided condor_transfer.pl script used during job processings. This file is executed on the processing machines (Condor nodes). If you are performing a standalone installation, set CMD_CONDOR_TRANSFER to $YOUPI_INSTALL_DIR/tools/condor_transfer.pl since everything is installed locally. In case of a cluster installation, the condor_transfer .pl must be installed on all cluster nodes accepting jobs. Put this script in the same directory on all nodes and update the CMD_CONDOR_TRANSFER variable accordingly.
CONDOR_TRANSFER_OPTIONS

Any options you might want to set on the CMD_CONDOR_TRANSFER command line. For more information, see the man page:

$ $YOUPI_INSTALL_DIR/tools/condor_transfer.pl --man
CMD_FITSVERIFY
Path to the fitsverify command (used during images ingestion)
PROCESSING_OUTPUT
Path to the directory were all your processing data results are going to be stored. Youpi uses a per-user per-application storage policy. Thus, John’s Scamp user results will be stored in PROCESSING_OUTPUT/john/scamp/. Of course, this directory can point to any remote networked path (NFS for example).
FTP_URL

URL for FTP transfer. To maximize performance during network transfers, all results data are transferred back to the PROCESSING_OUTPUT directory using the FTP protocol with the condor_transfer.pl tool distributed with Youpi. FTP_URL should only contain the name of the target host (running a FTP server) that will receive the processing output data. For example, if your PROCESSING_OUTPUT directory is located on your data.example.org host, just set your FTP_URL this way:

FTP_URL = 'ftp://data.example.org/'
WWW_FITSIN_PREFIX

Must point to your web server responsible for serving your HTML results over HTTP. For example, all QualityFITS results will belong to a PROCESSING_OUTPUT/USER/fitsin/OUTPUT_DIRECTORY/IMAGE_NAME/qualityFITS/ directory. PROCESSING_OUTPUT is the settings.py parameter you just configured while OUTPUT_DIRECTORY and IMAGE_NAME are variables used by the QualityFITS plugin. So, in order to serve those results and make them available fro, Youpi in your web browser, you will have to configure your web server so that it servers content from the PROCESSING_OUTPUT directory. Let’s say you want to serve all Youpi results stored in /var/youpi/results/ (your PROCESSING_OUTPUT directory) from your data.example.org host with your Apache web server listening on port 9000. Then you might want to add this entry in your Apache configuration file:

Listen 9000
<VirtualHost *:9000>
        DocumentRoot "/var/youpi/results/"
        <Directory "/var/youpi/results/">
                Options Indexes MultiViews FollowSymLinks
                AllowOverride None
                Order allow,deny
                Allow from all
        </Directory>
</VirtualHost>

The same applies for WWW_FITSIN_SCAMP, WWW_FITSIN_SEX and WWW_FITSIN_SWARP. As written in the default settings.py-dist file, they have the same value as WWW_FITSIN_PREFIX, which is the default behaviour since they share the same PROCESSING_OUTPUT directory.

FILE_BROWSER_ROOT_DATA_PATH
Youpi allows you to select some local or remote (NFS mounted) data path using a file browser widget. In order to display directory contents properly you need to set this variable to the root data path you plan to use for accessing your data. It can be any directory or mount point. For example if your NFS shares are mounted under /mnt/nfs, use this value as the root path to your data. Feel free to change the FILE_BROWSER_ROOT_TITLE if the default value doesn’t suit your needs.
INGESTION_HOST_PATTERN
Regular expression used at ingestion step to determine the target hostname (a cluster node) that will be used as a Condor requirement target machine. You must use only one set of parenthesis; Youpi will try to search a match in the complete path to your input images directory. Note that the FILE_BROWSER_ROOT_DATA_PATH is substracted from the current path to the data before applying your host pattern.
INGESTION_HOST_MAPPING

Whenever INGESTION_HOST_PATTERN matches your path-to-images directory, a real hostname will be searched in the INGESTION_HOST_MAPPING dictionary. If the precedent pattern matches a key in this dictionary, the corresponding value will be used as a Condor target hostname requirement. For example, if your path to input images is something like /path/to/root/directory/host5/testing/megacam/, then you could set the following variables in your local_conf.py:

INGESTION_HOST_PATTERN  = r'^/(.*?)/.*$'
INGESTION_HOST_MAPPING = {'host5': 'host5.mydomain.org'}
# Keys can also be declared as regexps
#INGESTION_HOST_MAPPING = {r'host.*': '\1.mydomain.org'}

Since host5 will match, the corresponding value host5.mydomain.org will be used. You can also use regular expressions. In this case, \1 matches the current hostname value matched.

INGESTION_DEFAULT_HOST
Hostname used as a Condor requirement target machine if no match is found at all.
INGESTION_MAIL_FROM
Email address used for mail notifications when ingesting images.
CONDOR_NOTIFY_USER
Any valid email address where Condor is going to send job processing errors, if any.
DATABASE_NAME
The name of the MySQL database you want to use with Youpi. The value must match the one you used in the MySQL section. Also, you will have to set the DATABASE_USER, DATABASE_PASSWORD, DATABASE_HOST and DATABASE_PORT variables appropriately.
ADMINS
A Python tuple of tuples for Youpi administrators. Put your name and email to receive notifications when a problem occurs.
TRUNK

This is the path of your Youpi installation, put the value of your $YOUPI_INSTALL_DIR here:

TRUNK = '/home/user/youpi'
COMPRESS_YUI_BINARY

New in version 0.6.1.

The command line to use Yahoo’s YUI Java compressor to minify static CSS and Javascript files. Please note that this parameter is not required but is highly recommanded if you want to increase the application’s performance in production.

That’s it. You are done with the basic configuration variables. The remaining of the settings.py file can be left untouched.

Instead of heavily modifying the settings.py file directly, it is also possible to overwrite its variables using a separate local_conf.py file, which will be imported automatically by Django when processing settings.py. This is a good place to set your database configuration settings for example. A tipically local_conf.py file may look like this:

CMD_CONDOR_TRANSFER = '/usr/local/bin/condor_transfert.pl'
CMD_FITSVERIFY          = 'fitsverify'
# Testing database (do not use for production)
DATABASE_NAME       = 'youpitest'
DATABASE_USER       = 'debug'
DATABASE_PASSWORD   = 'debug'
DATABASE_HOST       = 'db.example.org'
#
FILE_BROWSER_HEADER_TITLE = 'Cluster Path Browser'
FILE_BROWSER_ROOT_TITLE = 'Network Filesystem'
FILE_BROWSER_ROOT_DATA_PATH = '/mnt/nfs'
# Email sent when a job fails
CONDOR_NOTIFY_USER = 'monnerville@iap.fr'
INGESTION_HOST_PATTERN = r'^/(.*?)/.*$'
INGESTION_HOSTS_MAPPING = {r'.*ix.*': '\1.domain.org'}

Warning

Some configuration variables must not be overwritten in a local_conf.py file since other variables defined in settings.py may depend on their definition. All variables can be defined in local_conf.py except for TRUNK, CMD_CONVERT, CONVERT_THUMB_OPT, FTP_URL, PROCESSING_OUTPUT and WWW_FITSIN_PREFIX.

Now, give ownership to your web server’s user:

$ chown -R www-data:www-data $YOUPI_INSTALL_DIR

In your case, this may not be www-data but apache or www depending on your Linux distribution. Just look at the entries in your /etc/password and /etc/group system files.

2.1.4.4. Initial Setup Checks

Now it’s time to run first-time installation checks involving creating the database structure, then populating it with initial data and creating a database admin account. Most of Youpi management is done from the $HOME/youpi/terapix directory using the manage.py command line script.

In order to create the database structure, ask the Django manager to issue the syncdb command:

$ cd $YOUPI_INSTALL_DIR/terapix
$ python manage.py syncdb

This will create all the database tables required by Youpi. It will also ask for a login, a valid email address and a password in order to create an active admin account. Only active accounts are allowed to authenticate and enter the Youpi application. Populating the database with initial data and running some safety checks involves the checksetup command:

$ python manage.py checksetup

Note

This command is non-destructive: you can use it whenever you want. It will only check that everything in the database is setup properly and will not alter or damage your data in any way.

2.1.4.5. Apache

Deploying Youpi into production using Apache is easy since Youpi comes with handy tools to help you get ready easily. From the $YOUPI_INSTALL_DIR/terapix directory, just issue the following command:

$ python manage.py checksetup --wsgi --apache

The --wsgi option will create a $YOUPI_INSTALL_DIR/deploy/django.wsgi configuration file ready to use - with values matching your current Youpi installation - with any web server supporting the WSGI (Web Server Gateway Interface) specification.

The --apache option will generate a $YOUPI_INSTALL_DIR/deploy/youpi.conf configuration file ready to use with your Apache web server. You can either copy this file to your web server’s configuration files directory or copy its contents to your httpd.conf file.

Now just reload Apache before testing your installation:

$ sudo /etc/init.d/apache2 reload

Congratulations! The configuration step is over :)

2.1.5. Testing Your Installation

At this point, you are ready to run Youpi. Just point your browser to the web server running Youpi, say http://myhost/youpi/. The login page should appear. If not, you may start checking your server’s logs and reread all the configuration steps in this chapter.

2.1.6. Improving performance in a production environment

New in version 0.6.1.

This version comes with new features regarding performance improvement when Youpi is deployed on a production server (See the ChangeLog). Before that, a lot of HTTP requests were made every time a page was loaded. Many pages required that the browser issued more than 30(!) HTTP requests in order to download all the required content to render the page!

Even if browser caching is enabled (default), those 30+ HTTP queries will continue to be sent to the server(!) on every page reload. The browser will then receive a “304 Not Modified” HTTP response for all cached content.

Big performance improvements can be made with the following techniques:

Make fewer HTTP requests

It’s possible by combining all Javascript scripts used on the page into a single JS file. The same applies for CSS stylesheets. Moreover, with minified versions, page loading can be very fast. Youpi includes tools to generate static files combination and minification in a fully automatic way.

In order to generate those files, make sure you have set the COMPRESS_YUI_BINARY variable correctly (as explained in the Release Notes) in your settings.py (or local_conf.py) file and that the COMPRESS variable is undefined or set to True. If everything looks good, you can generate the files from the command line:

$ python manage.py synccompress --force

Those newly generated static files will be used automatically by Django when rendering templates.

Add Expires headers

A first-time visit to a page may require several HTTP requests to load all the components. By using Expires headers these components become cacheable, which avoids unnecessary HTTP requests on subsequent page views. Assuming that Apache’s mod_expires module is available (which is generally the case) and that the directory holding Youpi’s static files is /var/www/youpi/terapix/media, you can add the following directives to your Apache configuration file:

<Directory "/var/www/youpi/terapix/media">
        # "Far future expire" technique
        ExpiresActive on
        ExpiresDefault "access plus 1 week"
</Directory>

Note

You may want to set up a different ExpiresDefault value, but one week before cache expiration should be just fine.

Compress static components with gzip

Compression reduces response times by reducing the size of the HTTP response. Apache’s mod_deflate compression generally reduces the response size by about 70%! Assuming that Apache’s mod_deflate module is available (which is generally the case) and that the directory holding Youpi’s static files is /var/www/youpi/terapix/media, you can add the following directives to your Apache configuration file:

<Directory "/var/www/youpi/terapix/media">
        # Enables on-the-fly compression
        SetOutputFilter DEFLATE
        BrowserMatch ^Mozilla/4 gzip-only-text/html
        BrowserMatch ^Mozilla/4\.0[678] no-gzip
        BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
        SetEnvIfNoCase Request_URI \
        \.(?:gif|jpe?g|png|swf|pdf|t?gz|zip|bz2|rar)$ no-gzip dont-vary
</Directory>

Note

Note that Django templates (dynamic content) are already gzip’ed in Youpi using the django.middleware.gzip.GZipMiddleware middleware.

2.1.7. Upgrading to New Releases

Before upgrading to a newer version of Youpi, please ensure you have a recent backup of your database data. Then read the release notes in the ChangeLog section carefully.