Rolling out Repoze

by Martin Aspeli last modified May 23, 2008 09:14 AM

Back to the future

In case you didn't know, Repoze is a refactoring of Zope 2 that makes it play nice in today's WSGI-integrated, egg-packaged, paster-served world. I have previously predicted that it will be part of the future deployment- and configuration story for Plone, and those who've tried it seem to really like it.

To get more familiar with Repoze and its associated toolchain, I've moved two things to the Repoze platform. One is this blog, which runs a fairly standard Repoze deployment of Plone 3.1.1. The other is a site that has an "admin" interface based on standard Plone, and a simple Deliverance-themed front end for anonymous visitors.

Although the Repoze authors tend to advocate just easy_install'ing Repoze into a virtualenv, we tend to use Buildout in the Plone world. Repoze does have a standard buildout for Plone, albeit one with a few rough edges.

I'll discuss my setup below, but first the big picture:

  • Repoze is almost ready for prime time! There are a few rough edges to be smoothed out, and a few packaging issues to fix up (mainly due to Zope and Plone not supporting full deployment as Python eggs, which means that the Repoze guys have to re-package both), but the infrastructure works.
  • Repoze makes a lot of sense! It's generally very pleasant to work with and opens up new possibilities for working with Plone and other systems.
  • Deployment with mod_wsgi is interesting, but probably needs a bit more testing and experimentation. More on that in a bit.

A standard instance

My first Repoze setup, the one which runs this blog, is a pretty standard buildout. The buildout.cfg file looks like this:
[buildout]
parts = 
    site    
    instance

index = http://dist.repoze.org/plone/3.1.1/simple

develop =

[site]
recipe = repoze.recipe.egg
eggs = repoze.plone>=3.1.1

[instance]
recipe = iw.recipe.cmd
on_install = true
cmds = 
   bin/mkzope2instance --zope-port=8080
   echo "Please run 'bin/addzope2user <username> <password>'"

This simply pulls in the repoze.plone egg, using the index of packages built for Plone 3.1.1 by the Repoze guys, which also has download links to the various dependencies. We use repoze.recipe.egg, an extension of the more common zc.recipe.egg that comes with buildout, in order to ensure that all console scripts (including those in dependent packages) are installed.

The [instance] section simply runs the mkzope2instance script that comes with repoze.zope2 (a dependency of repoze.plone), which creates directories like Products and var and generates suitable zope.conf and other configuration files. This buildout deviates a bit from the standard Repoze one, which sets up ZEO by default and runs that under supervisord - more on this below.

I had to fix up the generated files a little, mainly to hardcode the paths to the Zope instance. Note that unlike the standard Plone buildouts, which generate a zope.conf with an INSTANCE_HOME in the parts directory, the Zope INSTANCE_HOME in the repoze setup is actually the buildout root itself.

At runtime, Zope is configured by Paste, via the following configuration file in etc/zope2.ini:

[DEFAULT]
debug = True

[app:zope2]
paste.app_factory = repoze.obob.publisher:make_obob
repoze.obob.get_root = repoze.zope2.z2bob:get_root
repoze.obob.initializer = repoze.zope2.z2bob:initialize
repoze.obob.helper_factory = repoze.zope2.z2bob:Zope2ObobHelper
zope.conf = %(here)s/zope.conf

[filter:errorlog]
use = egg:repoze.errorlog#errorlog
path = /__error_log__
keep = 20
ignore = paste.httpexceptions:HTTPUnauthorized
       paste.httpexceptions:HTTPNotFound
       paste.httpexceptions:HTTPFound

[pipeline:main]
pipeline = egg:Paste#cgitb
           egg:Paste#httpexceptions
           egg:repoze.retry#retry
           egg:repoze.tm#tm
           egg:repoze.vhm#vhm_xheaders 
           errorlog
           zope2

[server:main]
use = egg:repoze.zope2#zserver
host = 127.0.0.1
port = 8080


Error handling, retry, transactions and virtual hosting are all pushed to middleware, and Zope is served as a WSGI application.
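
To make the idea of a pipeline concrete, here is a minimal sketch of how filters wrap an application. It is not Paste's own code, and the names (hello_app, make_tagging_filter, build_pipeline) are made up for illustration. It is written for current Python 3 rather than the Python 2.4 used on this server.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Stand-in for the Zope application at the end of the pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def make_tagging_filter(tag):
    """Return a hypothetical filter that appends an X-Filter response header."""
    def middleware(app):
        def wrapped(environ, start_response):
            def tagged_start(status, headers, exc_info=None):
                return start_response(status, headers + [("X-Filter", tag)], exc_info)
            return app(environ, tagged_start)
        return wrapped
    return middleware


def build_pipeline(filters, app):
    """Wrap app so that the first filter listed ends up outermost."""
    for middleware in reversed(filters):
        app = middleware(app)
    return app


def call(app, path="/"):
    """Call a WSGI app once and return (status, headers, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


pipeline = build_pipeline(
    [make_tagging_filter("outer"), make_tagging_filter("inner")], hello_app
)
status, headers, body = call(pipeline)
```

The request passes through "outer" first and "inner" second, and the response travels back in the opposite order, which is why the order of entries in a [pipeline:...] section matters.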

To start the application server, you could do:

$ ./bin/paster serve etc/zope2.ini

In fact, I run this via supervisor - more on that below.

To serve this up, I've got Varnish running on a particular IP address, which the domain name for this site resolves to. CacheFu is installed manually, by unpacking it to the top-level Products/ directory. I could've used a buildout recipe to do this, but I was just lazy.

Varnish passes requests to Apache, which is used to host lots of different things on one server. The virtual host configuration for Repoze looks like this:

<VirtualHost 89.250.115.180:80>
    ServerName martinaspeli.net 
    ServerAlias www.martinaspeli.net
    ServerAlias blog.martinaspeli.net 

    RewriteEngine on
    RewriteRule     ^(.*)   http://localhost:8080/VirtualHostBase/http/%{SERVER_NAME}:80/blog/VirtualHostRoot$1 [L,P]

</VirtualHost>


This is no different from a pre-Repoze setup.
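
For anyone who, like me, struggles to remember the shape of that rewrite target, here is a small sketch that assembles it from its parts. The function name and parameters are my own invention for illustration; the URL layout is the one used in the RewriteRule above.

```python
def vhm_rewrite_target(backend, scheme, host, port, site_path, request_path):
    """Build a VirtualHostBase/VirtualHostRoot URL like the RewriteRule target.

    backend      -- where Zope listens, e.g. "http://localhost:8080"
    scheme/host/port -- what the visitor sees in the browser
    site_path    -- path of the Plone site inside Zope, e.g. "/blog"
    request_path -- the path the visitor asked for, e.g. "/about"
    """
    site = site_path.strip("/")
    site = "/" + site if site else ""
    return "%s/VirtualHostBase/%s/%s:%d%s/VirtualHostRoot%s" % (
        backend.rstrip("/"), scheme, host, port, site, request_path
    )


url = vhm_rewrite_target(
    "http://localhost:8080", "http", "martinaspeli.net", 80, "/blog", "/about"
)
```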

Things to improve

There are a few things I think we could improve with this setup:

  • Plone should support a standard distribution consisting only of eggs, so that Repoze can stop re-packaging Plone. Zope will need to do the same, but Zope changes less frequently.
  • There should be a buildout recipe to set up the Zope instance and properly generate the configuration files so that they require no editing. This should be based on the options supported by plone.recipe.zope2instance right now.
  • We could make a few layout tweaks to make the Repoze-based buildout and the pre-Repoze buildout produce more similar outputs. This will mean existing documentation is more likely to stay relevant and useful.

A complex setup with Deliverance and mod_wsgi

The configuration above shows how to use Repoze to build something that'll be fairly familiar to users of Zope today, but which makes it easier to configure applications using WSGI. For my next experiment, a "photo blog" a bit like blipfoto.com without the annoying date limitations, I decided to go pretty far in the other direction. This setup involves:

  • A site on "domain.com", which is styled using Deliverance. This is very limited - it shows a page title, some free text, a picture, and a calendar.
  • A standard Plone site on "admin.domain.com". This uses some simple content types to let the (single) user create "photo blog posts" and show them in an album. It also uses plone.portlet.contentcalendar to display photo blog posts in a calendar.
  • The Deliverance theme and style sheet are served up from "static.domain.com". This just uses Apache. I'd actually like to get rid of this, and have Deliverance read resources from the filesystem. Apparently, this is possible with Deliverance trunk, which can "mount" a directory on a virtual URL like "/.deliverance/" and translate relative "file:///" URLs to this setup, but I haven't been able to try it out yet.

The buildout for this setup looks as follows:

[buildout]
parts = 
    blog
    instance

index = 
    http://dist.repoze.org/simple/

find-links =
    http://dist.repoze.org/
    http://dist.plone.org
    http://download.zope.org/ppix/
    http://effbot.org/downloads

develop =
    src/photoblog.policy
    src/plone.portlet.contentcalendar
    src/collective.photoblog

[blog]
recipe = repoze.recipe.egg
eggs =
    photoblog.policy
interpreter = blogpy

[instance]
recipe = iw.recipe.cmd
on_install = true
cmds = 
   bin/mkzope2instance --use-zeo --zeo-port=8881 --zope-port=8880
   echo "Please run 'bin/supervisord', then 'bin/addzope2user <username> <password>'"
   echo "Control Zope and ZEO via 'bin/supervisorctl' subsequently."

 

In this case, I'm using a product "photoblog.policy" to be "the application". This has the following in its setup.py:

      install_requires=[
          'setuptools',
          # -*- Extra requirements: -*-
          'repoze.plone',
          'Deliverance',
          'plone.portlet.contentcalendar',
          'collective.photoblog',
      ],

Thus, it declares "Plone" as a dependency (it should probably have specified a version, too), which gives us Zope and all of Repoze's standard packages. Deliverance, the content calendar portlet and a package that contains the photo blog content types and views are included as well. The package also declares an extension profile which installs these things and configures the site a little.

The [instance] section of the buildout again uses mkzope2instance from Repoze - this time with ZEO - and again I had to edit zope.conf and zeo.conf to set absolute paths for the INSTANCE_HOME. See the section on supervisor below for details on how I run ZEO.

The Paste configuration is a lot more interesting here. I've put it in blog.ini, and it looks like this:

[DEFAULT]
debug = True

#
# Servers
#

[server:main]
# use = egg:repoze.zope2#zserver
# use = egg:PasteScript#wsgiutils
use = egg:PasteScript#cherrypy
host = 127.0.0.1
port = 8001

#
# Composite - defines the main URL mappings
#

[composite:main]
use = egg:Paste#urlmap
/static = static
/admin = zope
/ = blog

#
# Main applications
#

[app:zope2.app]
paste.app_factory = repoze.obob.publisher:make_obob
repoze.obob.get_root = repoze.zope2.z2bob:get_root
repoze.obob.initializer = repoze.zope2.z2bob:initialize
repoze.obob.helper_factory = repoze.zope2.z2bob:Zope2ObobHelper
zope.conf = %(here)s/etc/zope.conf

[app:static.app]
use = egg:Paste#static
document_root = %(here)s/static

#
# Pipelines
#

[pipeline:blog]
pipeline = egg:Paste#cgitb
           egg:Paste#httpexceptions
           deliverance
           egg:repoze.retry#retry
           egg:repoze.tm#tm 
           egg:repoze.vhm#vhm_xheaders
           errorlog
           zope2.app

[pipeline:static]
pipeline = egg:Paste#cgitb
           egg:Paste#httpexceptions
           errorlog
           static.app
 
[pipeline:zope]
pipeline = egg:Paste#cgitb
           egg:Paste#httpexceptions
           egg:repoze.retry#retry
           egg:repoze.tm#tm
           egg:repoze.vhm#vhm_xheaders 
           errorlog
           zope2.app
           
#
# Filters used in pipelines
#

[filter:deliverance]
use = egg:Deliverance
theme_uri = http://static.domain.com/theme.html
rule_uri = file://%(here)s/etc/rules.xml

[filter:profile]
use = egg:repoze.profile#profile
log_filename = myapp.profile
discard_first_request = true
path = /__profile__
flush_at_shutdown = true

[filter:errorlog]
use = egg:repoze.errorlog#errorlog
path = /__error_log__
keep = 20
ignore = paste.httpexceptions:HTTPUnauthorized
         paste.httpexceptions:HTTPNotFound
         paste.httpexceptions:HTTPFound

Let's consider the various parts in turn:

  • There is a "main" server that uses the CherryPy WSGI server, mainly because tests indicate it's fast. There is a corresponding "composite" configuration that mounts the "zope" application on /admin and the "blog" application on / (the root). Static content is served out of the folder "static" in the same directory, mounted on /static. These are used when the application is run under "paster serve" (as above), which means that you can view the Deliverance-themed application on localhost:8001, and see standard Zope on localhost:8001/admin.
  • There is a pipeline application defined for each of "blog", "static" and "zope". Importantly, the "blog" pipeline includes the "deliverance" filter.
  • The "deliverance" filter is defined below, and specifies the URL to serve the theme file from and the location of the rules file. In this case, we serve this out of Apache.
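
The "composite" part is the least obvious of these, so here is a rough sketch of prefix-based dispatch in the spirit of the [composite:main] section. It is an illustration only, not the code behind egg:Paste#urlmap, and the helper names are hypothetical.

```python
def make_urlmap(mapping):
    """Return a WSGI app that dispatches on the longest matching path prefix."""
    routes = sorted(mapping.items(), key=lambda item: len(item[0]), reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in routes:
            prefix = prefix.rstrip("/")
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    return dispatcher


def named(name):
    """Stand-in application that reports its name and the path it was given."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [("%s:%s" % (name, environ["PATH_INFO"])).encode()]
    return app


def get(app, path):
    """Call the app for a path and return the response body."""
    environ = {"PATH_INFO": path, "SCRIPT_NAME": ""}
    return b"".join(app(environ, lambda *args: None))


main = make_urlmap({
    "/static": named("static"),
    "/admin": named("zope"),
    "/": named("blog"),
})
```

With this mapping, a request for /admin/manage reaches the "zope" stand-in with the /admin prefix stripped, while anything that does not match /static or /admin falls through to "blog".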

During local development, I served Deliverance content from localhost:8001/static. For deployment, I am serving the "static" directory with Apache using a virtual host for "static.domain.com". The Apache configuration for this is simple:

<VirtualHost 89.250.115.180:80>
    ServerName static.domain.com 
    DocumentRoot /home/optilude/sites/photoblog/static
    <Location />
        Order Allow,Deny
        Allow from all
    </Location>
</VirtualHost>

The other two virtual hosts are more interesting. Here, I'm using mod_wsgi to serve the application up within Apache - no long-running Zope process (ZEO client) at all!

WSGIDaemonProcess photoblog threads=10 processes=1 maximum-requests=10000 python-path=/usr/lib/python2.4/site-packages

<VirtualHost 89.250.115.180:80>
    ServerName domain.com 
    ServerAlias www.domain.com
    WSGIScriptAlias / /home/optilude/sites/photoblog/bin/zope.wsgi
    WSGIProcessGroup photoblog 
    WSGIApplicationGroup %{GLOBAL}
    WSGIPassAuthorization On
    SetEnv DISABLE_PTS 1
    SetEnv HTTP_X_VHM_HOST http://domain.com
    SetEnv HTTP_X_VHM_ROOT /blog
    SetEnv APP_NAME blog
</VirtualHost>

<VirtualHost 89.250.115.180:80>
    ServerName admin.domain.com 
    WSGIScriptAlias / /home/optilude/sites/photoblog/bin/zope.wsgi
    WSGIProcessGroup photoblog 
    WSGIApplicationGroup %{GLOBAL}
    WSGIPassAuthorization On
    SetEnv DISABLE_PTS 1
    SetEnv HTTP_X_VHM_HOST http://admin.domain.com
    SetEnv HTTP_X_VHM_ROOT /blog
    SetEnv APP_NAME zope
</VirtualHost>

Note the nice approach to virtual hosting. I could never remember the VirtualHostRoot stuff in Zope 2.
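
To show how little information those two SetEnv lines actually carry, here is a sketch of how a piece of middleware might interpret them. This is my own illustration and not the implementation in repoze.vhm; the function name and the keys of the returned dictionary are assumptions. Under mod_wsgi, SetEnv values appear in the WSGI environ, which is why they can be read like request headers.

```python
from urllib.parse import urlsplit


def virtual_host_info(environ):
    """Interpret X-VHM-style values from a WSGI environ (illustrative only)."""
    host_url = environ.get("HTTP_X_VHM_HOST")
    if not host_url:
        return None
    parts = urlsplit(host_url)
    # Fall back to the default port for the scheme when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    root = environ.get("HTTP_X_VHM_ROOT", "").rstrip("/")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "root": root,
        # Where the request would be looked up inside Zope.
        "traversal_path": root + environ.get("PATH_INFO", "/"),
    }


info = virtual_host_info({
    "HTTP_X_VHM_HOST": "http://admin.domain.com",
    "HTTP_X_VHM_ROOT": "/blog",
    "PATH_INFO": "/front-page",
})
```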

Note - the code example above has been corrected since this article was first published. I was using the wrong mod_wsgi configuration before, leading to performance problems.

To use mod_wsgi, you need to create a script that it can load as a WSGI application. mod_wsgi actually supports virtualenv directly, letting you specify a root python-path. With buildout, this doesn't quite work - there is no directory containing all installed eggs. Instead, the working set is calculated by various recipes and when console scripts are generated in the bin/ directory, they are prepended with an explicit sys.path.

The mkzope2instance script that comes with Repoze will generate a zope.wsgi script that can be used by mod_wsgi, but it is not a console script, and so buildout never has a chance to set up its sys.path. I therefore had to do this manually (by copying the output from another script). I also amended the script to take the name of the application to serve (i.e. the name of an "app" or "pipeline" in the blog.ini file above) as a WSGI environment parameter. The resulting script looks like this:

# XXX: This is manually constructed!
import sys

sys.path[0:0] = [
  '/home/optilude/sites/photoblog/src/photoblog.policy',
  # other lines removed to save space
  ]

import os
from paste.deploy import loadapp

ini = '/home/optilude/sites/photoblog/blog.ini'

def application(environ, start_response):
    app_name = environ.get('APP_NAME', 'main')
    paste_app = loadapp('config:%s' % ini, name=app_name)
    return paste_app(environ, start_response)


With the mod_wsgi configuration above, there will be one daemon process (processes=1) waiting to service requests, capable of handling ten requests at a time (threads=10). It is possible to have multiple daemon processes, which will change the performance characteristics.

When a daemon process is first brought into action, it will start up all of Zope. This can take anywhere from 10 seconds to a couple of minutes depending on what you have installed and whether things like debug mode are on. Once all daemon processes are up and running, the site should be no slower than a standard Plone setup, but the first few requests can appear very slow.
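
Since start-up is the expensive part, one option is to make sure each named application is built only once per daemon process, even when several threads receive their first request at the same moment. The following is a sketch under that assumption, not something I have deployed. AppCache and fake_loader are hypothetical names; in the real script the loader would be a call to paste.deploy's loadapp with the ini file and the application name.

```python
import threading


class AppCache:
    """Build each named WSGI application once and reuse it afterwards."""

    def __init__(self, loader):
        self._loader = loader          # callable: name -> WSGI application
        self._apps = {}
        self._lock = threading.Lock()

    def get(self, name):
        app = self._apps.get(name)
        if app is None:
            with self._lock:
                # Re-check inside the lock: another thread may have won the race.
                app = self._apps.get(name)
                if app is None:
                    app = self._loader(name)
                    self._apps[name] = app
        return app


loads = []


def fake_loader(name):
    """Stand-in for loadapp(); records how often each application is built."""
    loads.append(name)
    return lambda environ, start_response: [name.encode()]


cache = AppCache(fake_loader)


def application(environ, start_response):
    app_name = environ.get("APP_NAME", "main")
    return cache.get(app_name)(environ, start_response)


results = [
    application({"APP_NAME": name}, None) for name in ("blog", "blog", "zope")
]
```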

Things to improve

This setup is a lot more bespoke than the one described above. There are a lot of moving parts, so I think it may take some time to arrive at "best practice", but here are some things I think would be worth looking at:

  • The buildout recipe that creates the instance as envisaged above should be flexible enough to support bespoke setups like this one. At least, it shouldn't step on anyone's toes.
  • I'd rather not depend on a subdomain to serve up the static Deliverance resources. The "/.deliverance/" mount point for filesystem resources that Ian says is now working in Deliverance trunk will probably mean there's no need for this.
  • We need a simple buildout recipe to generate the WSGI script with a proper sys.path. This should be pretty simple.
  • I'm not sure of the best way to use CacheFu and Varnish in this setup. As far as Plone (and thus CacheFu) is concerned, there's only one Plone instance, on /blog in the ZMI, but it gets served up in very different ways. On admin.domain.com, it's more or less a standard Plone site, with all the associated graphics and styles from the NuPlone skin. On www.domain.com, only a single image and the HTML of the calendar are being used, the rest is basically static content.
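
As a starting point for the WSGI script recipe mentioned in the list above, here is a sketch of the core step: rendering the script text from a list of paths. It is not a working buildout recipe. A real one would take the paths from the working set that buildout computes, whereas here they are simply passed in, and render_wsgi_script is a name I made up.

```python
import pprint

TEMPLATE = '''\
import sys

sys.path[0:0] = %(paths)s

from paste.deploy import loadapp

ini = %(ini)r

def application(environ, start_response):
    app_name = environ.get('APP_NAME', 'main')
    paste_app = loadapp('config:%%s' %% ini, name=app_name)
    return paste_app(environ, start_response)
'''


def render_wsgi_script(paths, ini_path):
    """Return the text of a WSGI script with an explicit sys.path."""
    return TEMPLATE % {"paths": pprint.pformat(list(paths)), "ini": ini_path}


script = render_wsgi_script(
    ["/home/optilude/sites/photoblog/src/photoblog.policy"],
    "/home/optilude/sites/photoblog/blog.ini",
)
```

The generated text mirrors the hand-written script above, so the only thing the recipe would need to add is the calculation of the path list.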

Supervisor

Chris McDonough's Supervisor has definitely been the hidden gem of my Repoze experiments. It is a Unix process management tool that makes it easy to start, stop, log and monitor processes, with an easy-to-use command line client and even a web GUI.

The standard Repoze buildouts use supervisor to manage the ZEO server and the "paster serve" process. I've decided to centralise my supervisor setup so that I can use one supervisor process to control many different servers. Perhaps overkill, but I find it quite useful.

To do this, I've got a simple buildout that just sets up supervisor:

[buildout]
parts = 
    supervisor

index = 
    http://dist.repoze.org/simple/

[supervisor]
recipe = repoze.recipe.egg
eggs = supervisor


With this, I get a ./bin/supervisord (the daemon) and ./bin/supervisorctl (the client control program). I start the former with an init.d script like this in Ubuntu:

#!/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DESC="Supervisor"
NAME=supervisor
DAEMON=/home/optilude/sites/shared/bin/supervisord
CTL=/home/optilude/sites/shared/bin/supervisorctl
SCRIPTNAME=/etc/init.d/$NAME
OPTIONS="-c /home/optilude/sites/shared/etc/supervisord.conf"

# Gracefully exit if the package has been removed.
test -x $DAEMON || exit 0

case "$1" in
  start)
        cd /
        echo -n "Starting $DESC: $NAME"
        $DAEMON $OPTIONS 2>&1 >/dev/null < /dev/null
        echo "."
        ;;
  stop)
        echo -n "Stopping $DESC: $NAME"
        $CTL $OPTIONS shutdown 
        echo "."
        ;;
  *)
        echo "Usage: $SCRIPTNAME {start|stop}" >&2
        exit 3
        ;;
esac

exit 0


The supervisor configuration file is in etc/supervisord.conf and looks like this:

[inet_http_server]
port=127.0.0.1:9030

[supervisord]
logfile=%(here)s/../log/supervisord.log
logfile_maxbytes=50MB
logfile_backups=10 
loglevel=info
pidfile=%(here)s/../var/supervisord.pid
nodaemon=false

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=http://127.0.0.1:9030

[include]
files = /home/optilude/sites/supervisor.d/*.conf


This basically just wires up the server and its log and looks for programs to manage in /home/optilude/sites/supervisor.d/*.conf. I have two files there currently. The one for this blog looks like this:

[program:personal]
command = /home/optilude/sites/personal/bin/paster serve /home/optilude/sites/personal/etc/zope2.ini
redirect_stderr = true


This just runs the paster process. I'm not using ZEO here. The one for the photo blog site runs ZEO (since the web server runs under mod_wsgi in Apache):

[program:photoblog]
command = /home/optilude/sites/photoblog/bin/runzeo -C /home/optilude/sites/photoblog/etc/zeo.conf
redirect_stderr = true
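
Both program files follow the same pattern, so they could be generated rather than written by hand. Here is a minimal sketch using Python's standard configparser module. The function name is hypothetical, and only the two options used in my files (command and redirect_stderr) are written.

```python
import configparser
import io


def render_programs(programs):
    """Render supervisor [program:x] sections for (name, command) pairs."""
    # interpolation=None so that any % in a command is written out untouched.
    config = configparser.ConfigParser(interpolation=None)
    for name, command in programs:
        config["program:%s" % name] = {
            "command": command,
            "redirect_stderr": "true",
        }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


text = render_programs([
    ("personal",
     "/home/optilude/sites/personal/bin/paster serve "
     "/home/optilude/sites/personal/etc/zope2.ini"),
    ("photoblog",
     "/home/optilude/sites/photoblog/bin/runzeo "
     "-C /home/optilude/sites/photoblog/etc/zeo.conf"),
])
```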

 

With these running, I can run commands like "supervisorctl status personal" to see the status of this blog or "supervisorctl tail -f photoblog" to watch the console output of the photo blog ZEO process.
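
The same information is available to scripts, because the [inet_http_server] and [rpcinterface:supervisor] sections above expose supervisor's XML-RPC interface. The sketch below assumes the server is listening on 127.0.0.1:9030 as in my configuration and that the interface is served under /RPC2. The helper names are my own, and I have kept the network call out of the module body so the code can be tried with a stand-in object.

```python
from xmlrpc.client import ServerProxy


def connect(url="http://127.0.0.1:9030/RPC2"):
    """Return a proxy for supervisor's XML-RPC interface (no request is sent yet)."""
    return ServerProxy(url)


def process_state(proxy, name):
    """Return the state name (e.g. RUNNING, STOPPED) of one managed process."""
    info = proxy.supervisor.getProcessInfo(name)
    return info["statename"]


class FakeSupervisor:
    """Stand-in for the remote 'supervisor' namespace, for trying this offline."""

    def getProcessInfo(self, name):
        return {"name": name, "statename": "RUNNING"}


class FakeProxy:
    supervisor = FakeSupervisor()


state = process_state(FakeProxy(), "personal")
```

Against a live server, the call would be process_state(connect(), "personal").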

Supervisor has a ton of features, such as the ability to automatically restart processes if they die or if they exceed a set memory limit. You can easily use it with standard Zope setups (use runzope rather than zopectl in this case) and any other application you want to control, such as an external database.


Buildout for Plone 3 with deliverance on WSGI

Posted by http://sargo.openid.pl/ at Jul 12, 2009 01:07 PM
Good tutorial - very helpful!

I made a buildout quite similar to the solution described in this blog post. It uses Plone 3.3rc4, Deliverance 0.3 and mod_wsgi for production. In development it also includes useful add-ons (like stxnext.pdb or egg:Paste#evalerror).

http://lichota.pl/[…]/buildout-for-plone-3-with-deliverance-on-wsgi

Why install supervisor on a per buildout basis?

Posted by https://launchpad.net/~lcc-lazyweb at May 01, 2010 11:21 PM
Why not install it systemwide, once, and just build the configuration file in the buildout?