[systemd-devel] preparing unit definition for squid – some questions

Thu Nov 25 15:31:55 PST 2010

On Thu, Nov 25, 2010 at 7:43 PM, Tomasz Torcz <tomek at pipebreaker.pl> wrote:
> Hi,
>
>  I'm slowly porting the init scripts of software I use to systemd unit
> files.  It is generally straightforward, but today Squid made me wonder.
>
>  Squid is a web-caching daemon (proxy) with some peculiarities.  It
> stores its cache of the internet in directories on disk.  To
> parallelize access, it is recommended to define a separate cache_dir
> for each hard disk.  Also, squid likes a 2-level deep folder structure
> in each cache_dir, to implement simple hashing and deal with lesser
> filesystems.
>
>  Populating each defined cache_dir can be a slow operation, therefore
> it is not done automatically.  Fedora's squid script checks if cache_dir
> is empty and, if so, runs the command to create the folder hierarchy.
>
>  This “populate” command is a great candidate for separating into its
> own unit, squid-populate.service, required by the main squid.service
> and using a negated ConditionDirectoryNotEmpty=.
>
>  As I mentioned, there could be more than one cache_dir definition.
> For each one, a separate squid-populate.service should be created, using
> @ instances. (BTW, we cannot grep out all the directories from
> squid.conf; the administrator needs to repeat the cache_dir
> configuration in the systemd units).
>
>  Now the question:  how should requires be defined?  Which way is better?
> 1) should squid.service explicitly require squid-populate@/var/spool/squid/cache1.service,
>   …cache2.service etc?  This means that the administrator would need to
>  modify the main squid.service when adding more cache_dirs
>
> or
>
> 2) should each instance define itself as required by squid.service?  If so,
>   how can that be done?
>   This approach makes adding directories simpler.
>
>  N.B. I consider a unit generator that parses squid.conf overkill for
> this task :)

IMHO this is totally wrong: we're trying to solve the problem in the
wrong place. Squid itself should check for and create the missing
directories, since it already has to load the configuration file and
(probably) the cache directory structure. This matters even more
because it shouldn't assume that all directories exist just because at
least one does (i.e. the cache_dir is not empty, but not all required
subdirectories are there).

There are also no parallelization gains from making it a separate unit,
as squid can't work until the folders are there.
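
For completeness, if population is kept in a separate unit anyway,
option 2 above could look roughly like the sketch below. It is a
minimal, untested template; the file name squid-populate@.service,
the /usr/sbin/squid path and the use of "squid -z" are assumptions,
and specifier expansion in Condition*= lines may depend on the
systemd version.

```ini
# squid-populate@.service (hypothetical template unit)
[Unit]
Description=Populate Squid cache directory %f
# "!" negates the check, so the unit only runs when the directory is empty.
# %f is the unescaped instance name with "/" prepended.
ConditionDirectoryNotEmpty=!%f
# Requires= alone does not order units; make squid wait for population.
Before=squid.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Assumed command: "squid -z" creates the missing swap directories
# for the cache_dir entries in squid.conf, not just for %f.
ExecStart=/usr/sbin/squid -z

[Install]
# Enabling an instance links it into squid.service.requires/,
# so squid.service itself does not need to be edited.
RequiredBy=squid.service
```

An instance would then be enabled per directory, with the path escaped
in the instance name, e.g.
"systemctl enable squid-populate@var-spool-squid-cache1.service".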

>  Second issue: cascading timeouts.  Let's say squid-populate.service has TimeoutSec=3m,
> because it may be slow.  The main squid.service has the standard 1m timeout.
> What happens when I run "systemctl start squid.service" with an empty cache
> directory? squid-populate.service will be started and may work for, let's
> say, 2 minutes.  Will systemctl time out after 1m?  Or after 3m+1m?
>
>  In the future squid may be socket-activatable.  If population is needed,
> will the timeout for a connection be 3m (populate) + 1m (squid itself) +
> 1m (socket timeout) = 5m ?

Socket-activating squid makes little sense. One usually runs squid on
dedicated machines, and in that case squid should be started as soon as
the system boots so it can start serving requests. As you already said,
startup may take a couple of minutes, and that would be unacceptable
for socket activation.

>  Side observation: systemd has only one timeout for start and stop.
> Squid's sysv script defines a short timeout for start and a much longer
> one for stop (because squid waits for open connections to end when closing).

This is a good point.

-- 
Gustavo Sverzut Barbieri
http://profusion.mobi embedded systems
--------------------------------------
MSN: barbieri at gmail.com
Skype: gsbarbieri
Mobile: +55 (19) 9225-2202