
dirplex(1)
==========

NAME
----
dirplex - Physical directory handler for ashd(7)

SYNOPSIS
--------
*dirplex* [*-hN*] [*-c* 'CONFIG'] 'DIR'

DESCRIPTION
-----------

The *dirplex* handler maps URLs into physical files or directories,
and, having found a matching file or directory, it performs various
kinds of pattern-matching against its physical name to determine what
handler to call in order to serve the request. The mapping procedure
and pattern matching are described below.

Having found a handler to serve a file or directory with, *dirplex*
adds the `X-Ash-File` header to the request with a path to the
physical file, before passing the request on to the handler.

*dirplex* is a persistent handler, as defined in *ashd*(7).

OPTIONS
-------

*-h*::

    Print a brief help message to standard output and exit.

*-N*::

    Do not read the global configuration file `dirplex.rc`.

*-c* 'CONFIG'::

    Read an extra configuration file. If 'CONFIG' contains any
    slashes, it is opened by that exact name. Otherwise, it is
    searched for in the same way as the global configuration file
    (see CONFIGURATION below).

URL-TO-FILE MAPPING
-------------------

Mapping URLs into physical files is an iterative procedure, each step
looking in one single physical directory, starting with 'DIR'. For
each step, a path element is stripped off the beginning of the rest
string and examined, the path element being either the leading part of
the rest string up until (but not including) the first slash, or the
entire rest string if it contains no slashes. If the rest string is
empty, the directory being examined is considered the result of the
mapping. Otherwise, any escape sequences in the path element under
consideration are unescaped before examining it.

If the path element names a directory in the current directory, the
procedure continues in that directory, unless there is nothing left of
the rest string, in which case *dirplex* responds with an HTTP 301
redirect to the same URL, but ending with a slash. Otherwise, the
remaining rest string begins with a slash, which is stripped off
before continuing. If the path element names a file, that file is
considered the result of the mapping (even if the rest string has not
been exhausted yet).

If the path element does not name anything in the directory under
consideration, but contains no dots, then the directory is searched
for a file whose name before the first dot matches the path
element. If there is such a file, it is considered the result of the
mapping.

If the result of the mapping procedure is a directory, it is checked
for the presence of a file named by the *index-file* configuration
directive (see CONFIGURATION below). If there is such a file, it is
considered the final result instead of the directory itself. If the
index file name contains no dots and there is no exact match, then,
again, the directory is searched for a file whose name before the
first dot matches the index file name.

See also 404 RESPONSES below.
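
As a hedged illustration of the procedure above, assume a hypothetical
'DIR' containing a directory `docs` that holds the files `index.html`
and `guide.html`, and assume that `index-file index.html` is in
effect. None of these names come from *dirplex* itself. Under those
assumptions, requests would be expected to map as follows.

--------
/docs                  -> 301 redirect to /docs/
/docs/                 -> docs/index.html (index file)
/docs/guide.html       -> docs/guide.html (exact match)
/docs/guide            -> docs/guide.html (name before the first dot)
/docs/guide.html/more  -> docs/guide.html (rest string not exhausted)
--------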

CONFIGURATION
-------------

Configuration in *dirplex* comes from several sources. When *dirplex*
starts, unless the *-N* option is given, it tries to find a global
configuration file named `dirplex.rc`. It looks in each directory
named by the *PATH* environment variable, with `../etc/ashd` appended
to it. For example, if *PATH* is
`/usr/local/bin:/bin:/usr/bin`, the directories `/usr/local/etc/ashd`,
`/etc/ashd` and `/usr/etc/ashd` are searched for `dirplex.rc`, in that
order. If several such files exist, only the first one found is used.

If the *-c* option is given to *dirplex*, it too specifies a
configuration file to load. If the name given contains any slashes, it
is opened by that exact name. Otherwise, it is searched for in the
same manner as the global configuration file.

In addition, all directories traversed by *dirplex* when mapping a URL
into a physical file may contain a file called `.htrc`, which may
specify extra configuration options for all files in and beneath that
directory.

`.htrc` files are checked periodically and reread if changed. The
global configuration file and any file named by the *-c* option,
however, are never reexamined.

When using the configuration files for deciding what to do with a
found file, they are examined in order of their "distance" from that
file. `.htrc` files found in the directory or directories containing
the file are considered "closest" to the file under consideration,
followed by any configuration file named by the *-c* option, followed
by the global configuration file.

Each configuration file is a sequence of configuration stanzas, each
stanza being an unindented starting line, followed by zero or more
indented follow-up lines adding options to the stanza. The starting
line of a stanza is referred to as a "configuration directive"
below. Each line is a sequence of whitespace-separated words. A word
may contain whitespace if such whitespace is escaped, either by
enclosing the word in double quotes, or by escaping individual
whitespace characters with a preceding backslash. Backslash quoting
may also be used to treat double quotes or another backslash literally
as part of the word. Empty lines are ignored, and lines whose first
character after leading whitespace is a hash character (`#`) are
treated as comments and ignored.
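
The syntax rules above can be illustrated with a short sketch. The
handler name `report` and the program path are hypothetical and only
serve to show quoting; the stanza layout follows the description
above.

--------
# Comments may appear on their own lines.
# Double quotes or a backslash keep whitespace inside a single word.
fchild report
    exec "/opt/my tools/report" --title Monthly\ Report
--------

Here the *exec* line consists of four words: `exec`,
`/opt/my tools/report`, `--title` and `Monthly Report`.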

The following configuration directives are recognized:

*include* ['FILENAME'...]::

    Read the named files and act as if their contents stood in
    place of the *include* stanza. A 'FILENAME' may be a glob
    pattern, in which case all matching files are used, sorted by
    their filenames. If a 'FILENAME' is a relative path, it is
    treated relative to the directory containing the file from
    which the *include* stanza was read, even if the inclusion has
    been nested. Inclusions may be nested to any level.

*index-file* ['FILENAME'...]::

    The given 'FILENAMEs' are used for finding index files (see
    URL-TO-FILE MAPPING above). Specifying *index-file* overrides
    entirely any previous specification in a more distant
    configuration file, rather than adding to it. Zero 'FILENAMEs'
    may be given to turn off index file searching completely. The
    *index-file* directive accepts no follow-up lines.

*child* 'NAME'::

    Declares a named, persistent request handler (see *ashd*(7)
    for a more detailed description of persistent handlers). It
    must contain exactly one follow-up line, *exec* 'PROGRAM'
    ['ARGS'...], specifying the program to execute and the
    arguments to pass it. If given in a `.htrc` file, the program
    will be started in the same directory as the `.htrc` file
    itself. The *child* stanza itself serves as the identity of
    the forked process -- only one child process will be forked
    per stanza, and if that child process exits, it will be
    restarted the next time the stanza would be used. If a `.htrc`
    file containing *child* stanzas is reloaded, any currently
    running children are reused for *child* stanzas in the new
    file with matching names (even if the *exec* line has
    changed).

*fchild* 'NAME'::

    Declares a named, transient request handler (see *ashd*(7) for
    a more detailed description of transient handlers). It must
    contain exactly one follow-up line, *exec* 'PROGRAM'
    ['ARGS'...], specifying the program to execute and the
    arguments to pass it. In addition to the specified arguments,
    the HTTP method, raw URL and the rest string will be appended
    as described in *ashd*(7). If given in a `.htrc` file, the
    program will be started in the same directory as the `.htrc`
    file itself.

*match* [*directory*]::

    Specifies a filename pattern-matching rule. The
    pattern-matching procedure and the follow-up lines accepted by
    this stanza are described below, under MATCHING.

*capture* 'HANDLER'::

    Only meaningful in `.htrc` files. If a *capture* directive is
    specified, then the URL-to-file mapping procedure as described
    above is aborted as soon as the directory containing the
    `.htrc` file is encountered. The request is passed, with any
    remaining rest string, to the specified 'HANDLER', which must
    be a named request handler specified either in the same
    `.htrc` file or elsewhere. The *capture* directive accepts no
    follow-up lines. Note that the `X-Ash-File` header is not
    added to requests passed via *capture* directives.
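
As a minimal sketch of the *include* and *index-file* directives, the
following fragment could appear in a configuration file. The `conf.d`
directory and the index file names are assumptions chosen for
illustration.

--------
# Read every matching file in conf.d, relative to the directory of
# this file, sorted by filename.
include conf.d/*.rc

# Accept index.html, or any file whose name before the first dot
# is "index".
index-file index.html index
--------

Giving *index-file* with no 'FILENAMEs' in a more deeply nested
`.htrc` file would be expected to turn index file searching off for
that directory and, unless overridden again, everything beneath it.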

MATCHING
--------

When a file or directory has been found by the mapping procedure (see
URL-TO-FILE MAPPING above), the name of the physical file is examined
to determine a request handler to pass the request to. Note that only
the physical file name is ever considered; any logical request
parameters such as the request URL or the rest string are entirely
ignored.

To match a file, any *match* stanzas specified by any `.htrc` file or
in the global configuration files are searched in order of their
"distance" (see CONFIGURATION above) from the actual file. If it is a
directory which is being considered, only *match* stanzas with the
*directory* parameter are considered; otherwise, if it is a file, only
*match* stanzas without the *directory* parameter are considered.

A *match* stanza must contain at least one follow-up line specifying
match rules. All rules must match for the stanza as a whole to match.
The following rules are recognized:

*filename* 'PATTERN'...::

    Matches if the name of the file under consideration matches
    any of the 'PATTERNs'. A 'PATTERN' is an ordinary glob
    pattern, such as `*.php`. See *fnmatch*(3) for more
    information.

*pathname* 'PATTERN'...::

    Matches if the entire path (relative as considered from the
    root directory being served) of the file under consideration
    matches any of the 'PATTERNs'. A 'PATTERN' is an ordinary glob
    pattern, except that slashes are not matched by wildcards. See
    *fnmatch*(3) for more information.

*default*::

    Matches if and only if no *match* stanza without a *default*
    rule matches (in any configuration file).

*local*::

    Valid only in `.htrc` files, *local* matches if and only if
    the file under consideration resides in the same directory as
    the containing `.htrc` file.

In addition to the rules, a *match* stanza must contain exactly one
follow-up line specifying the action to take if it matches. The
following actions are recognized:

*handler* 'HANDLER'::

    'HANDLER' must be a named handler (see CONFIGURATION
    above). The named handler is searched for not only in the same
    configuration file as the *match* stanza, but in all
    configuration files that are valid for the file under
    consideration, in order of distance. As such, a more deeply
    nested `.htrc` file may override the specified handler without
    having to specify any new *match* stanzas.

*fork* 'PROGRAM' ['ARGS'...]::

    Run a transient handler for this file, as if it were specified
    by a *fchild* stanza. This action exists mostly for
    convenience.
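
The rules and actions above combine as in the following sketch,
written for a `.htrc` file. The handler name `send`, the `downloads`
path and the file suffixes are assumptions for illustration only.

--------
fchild send
    exec sendfile

# Both rules must match: the file sits directly in the downloads
# directory under the served root (wildcards do not cross slashes)
# and its name ends in .zip.
match
    pathname downloads/*
    filename *.zip
    handler send

# Applies only to files in the same directory as this .htrc file,
# not to files in subdirectories.
match
    local
    filename *.txt
    fork sendfile
--------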

A *match* stanza may also contain any number of the following,
optional directives:

*set* 'HEADER' 'VALUE'::

    If the *match* stanza is selected as the match for a file, the
    named HTTP 'HEADER' in the request is set to 'VALUE' before
    passing the request on to the specified handler.

*xset* 'HEADER' 'VALUE'::

    *xset* does exactly the same thing as *set*, except the
    'HEADER' is automatically prepended with the `X-Ash-`
    prefix. The intention is only to make configuration files
    look nicer in this very common case.
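
As a brief sketch of the relationship between *set* and *xset*, and
going only by the prefixing rule above, the two alternative stanzas
below would be expected to set the same request header. The handler
name `send` is an assumption and is taken to be declared elsewhere.

--------
# Using xset:
match
    filename *.css
    xset content-type text/css
    handler send

# The same stanza, spelled out with set:
match
    filename *.css
    set X-Ash-content-type text/css
    handler send
--------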

404 RESPONSES
-------------

An HTTP 404 response is sent to the client if

 * The mapping procedure fails to find a matching physical file;
 * A path element is encountered during mapping which, after URL
   unescaping, either begins with a dot or contains slashes;
 * The mapping procedure finds a file which is neither a directory nor
   a regular file;
 * An empty, non-final path element is encountered during mapping; or
 * The mapping procedure results in a file which is not matched by any
   *match* stanza.

By default, *dirplex* will send a built-in 404 response, but any
`.htrc` file or global configuration may define a request handler
named `.notfound` to customize the behavior. Note that, unlike
successful requests, such a handler will not be passed the
`X-Ash-File` header.

The built-in `.notfound` handler can also be used in *match* or
*capture* stanzas (for example, to restrict access to certain files or
directories).
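
As a sketch of both uses described above, the fragment below hides
files with a hypothetical `.inc` suffix behind the `.notfound`
handler, and optionally replaces the built-in 404 response with a
custom program. The program name `my404` is an assumption and stands
for any program that produces the desired response.

--------
# Answer requests for *.inc files as if they did not exist.
match
    filename *.inc
    handler .notfound

# Optionally, customize the 404 response itself.
fchild .notfound
    exec my404
--------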

EXAMPLES
--------

The *sendfile*(1) program can be used to serve HTML files as follows.

--------
fchild send
    exec sendfile

match
    filename *.html
    xset content-type text/html
    handler send
--------

Assuming the PHP CGI interpreter is installed on the system, PHP
scripts can be used with the following configuration, using the
*callcgi*(1) program.

--------
# To use plain CGI, which uses more resources per handled request,
# but fewer static resources:
fchild php
    exec callcgi -p php-cgi

# To use FastCGI, which keeps PHP running at all times, but uses fewer
# resources per handled request:
child php
    exec callfcgi multifscgi 5 php-cgi

match
    filename *.php
    handler php
--------

If there is a directory without an index file, a file listing can be
automatically generated by the *htls*(1) program as follows.

--------
match directory
    default
    fork htls
--------

The following configuration can be placed in a `.htrc` file in order
to dedicate the directory containing that file to some external SCGI
script engine. Note that *callscgi*, and therefore the script engine
itself, is started in the directory itself, so that arbitrary code
modules or data files can be put directly in that directory and easily
found.

--------
child foo
    exec callscgi scgi-wsgi -p . foo

capture foo
--------

AUTHOR
------
Fredrik Tolf <fredrik@dolda2000.com>

SEE ALSO
--------
*ashd*(7)