dirplex(1)
==========

NAME
----
dirplex - Physical directory handler for ashd(7)

SYNOPSIS
--------
*dirplex* [*-hN*] [*-c* 'CONFIG'] 'DIR'

DESCRIPTION
-----------

The *dirplex* handler maps URLs into physical files or directories,
and, having found a matching file or directory, it performs various
kinds of pattern-matching against its physical name to determine what
handler to call in order to serve the request. The mapping procedure
and pattern matching are described below.

Having found a handler to serve a file or directory with, *dirplex*
adds the `X-Ash-File` header to the request with a path to the
physical file, before passing the request on to the handler.

*dirplex* is a persistent handler, as defined in *ashd*(7).

OPTIONS
-------

*-h*::

Print a brief help message to standard output and exit.

*-N*::

Do not read the global configuration file `dirplex.rc`.

*-c* 'CONFIG'::

Read an extra configuration file. If 'CONFIG' contains any
slashes, it is opened by that exact name. Otherwise, it is
searched for in the same way as the global configuration file
(see CONFIGURATION below).

URL-TO-FILE MAPPING
-------------------

Mapping URLs into physical files is an iterative procedure, each step
looking in one single physical directory, starting with 'DIR'. For
each step, a path element is stripped off the beginning of the rest
string (the part of the URL path that has not yet been consumed; see
*ashd*(7)) and examined, the path element being either the leading
part of the rest string up until (but not including) the first slash,
or the entire rest string if it contains no slashes. If the rest
string is empty, the directory being examined is considered the result
of the mapping. Otherwise, any escape sequences in the path element
under consideration are unescaped before examining it.

If the path element names a directory in the current directory, the
procedure continues in that directory, unless there is nothing left of
the rest string, in which case *dirplex* responds with an HTTP 301
redirect to the same URL, but ending with a slash. Otherwise, the
remaining rest string begins with a slash, which is stripped off
before continuing. If the path element names a file, that file is
considered the result of the mapping (even if the rest string has not
been exhausted yet).

If the path element does not name anything in the directory under
consideration, but contains no dots, then the directory is searched
for a file whose name before the first dot matches the path
element. If there is such a file, it is considered the result of the
mapping.

If the result of the mapping procedure is a directory, it is checked
for the presence of a file named by the *index-file* configuration
directive (see CONFIGURATION below). If there is such a file, it is
considered the final result instead of the directory itself. If the
index file name contains no dots and there is no exact match, then,
again, the directory is searched for a file whose name before the
first dot matches the index file name.

See also 404 RESPONSES below.

CONFIGURATION
-------------

Configuration in *dirplex* comes from several sources. When *dirplex*
starts, unless the *-N* option is given, it tries to find a global
configuration file named `dirplex.rc`. It looks in `$HOME/.ashd/etc`,
and then in all directories named by the *PATH* environment variable,
appended with `../etc/ashd`. For example, if *PATH* is
`/usr/local/bin:/bin:/usr/bin`, the directories `$HOME/.ashd/etc`,
`/usr/local/etc/ashd`, `/etc/ashd` and `/usr/etc/ashd` are searched
for `dirplex.rc`, in that order. Only the first file found is used,
should there exist several.

If the *-c* option is given to *dirplex*, it too specifies a
configuration file to load. If the name given contains any slashes, it
is opened by that exact name. Otherwise, it is searched for in the
same manner as the global configuration file.

In addition, all directories traversed by *dirplex* when mapping a URL
into a physical file may contain a file called `.htrc`, which may
specify extra configuration options for all files in and beneath that
directory.

`.htrc` files are checked periodically and reread if changed. The
global configuration file and any file named by the *-c* option,
however, are never reexamined.

When using the configuration files for deciding what to do with a
found file, they are examined in order of their "distance" from that
file. `.htrc` files found in the directory or directories containing
the file are considered "closest" to the file under consideration,
followed by any configuration file named by the *-c* option, followed
by the global configuration file.

Each configuration file is a sequence of configuration stanzas, each
stanza being an unindented starting line, followed by zero or more
indented follow-up lines adding options to the stanza. The starting
line of a stanza is referred to as a "configuration directive"
below. Each line is a sequence of whitespace-separated words. A word
may contain whitespace if such whitespace is escaped, either by
enclosing the word in double quotes, or by escaping individual
whitespace characters with a preceding backslash. Backslash quoting
may also be used to treat double quotes or another backslash literally
as part of the word. Empty lines are ignored, and lines whose first
character after leading whitespace is a hash character (`#`) are
treated as comments and ignored.
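
As an illustration of the syntax just described, the following sketch
shows two stanzas, a comment, and both forms of whitespace
escaping. The file names are hypothetical, and the `send` handler is
assumed to be set up as in EXAMPLES below.

--------
# Lines starting with a hash character are comments.
fchild send
  exec sendfile

match
  filename "release notes.html" release\ notes.htm
  xset content-type text/html
  handler send
--------

Here, `fchild send` and `match` are the configuration directives, and
the indented lines beneath each of them are their follow-up lines.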

The following configuration directives are recognized:

*include* ['FILENAME'...]::

Read the named files and act as if their contents stood in
place of the *include* stanza. A 'FILENAME' may be a glob
pattern, in which case all matching files are used, sorted by
their filenames. If a 'FILENAME' is a relative path, it is
treated relative to the directory containing the file from
which the *include* stanza was read, even if the inclusion has
been nested. Inclusions may be nested to any level.

*index-file* ['FILENAME'...]::

The given 'FILENAMEs' are used for finding index files (see
URL-TO-FILE MAPPING above). Specifying *index-file* overrides
entirely any previous specification in a more distant
configuration file, rather than adding to it. Zero 'FILENAMEs'
may be given to turn off index file searching completely. The
*index-file* directive accepts no follow-up lines.

*child* 'NAME'::

Declares a named, persistent request handler (see *ashd*(7)
for a more detailed description of persistent handlers). It
must contain exactly one follow-up line, *exec* 'PROGRAM'
['ARGS'...], specifying the program to execute and the
arguments to pass it. If given in a `.htrc` file, the program
will be started in the same directory as the `.htrc` file
itself. The *child* stanza itself serves as the identity of
the forked process -- only one child process will be forked
per stanza, and if that child process exits, it will be
restarted the next time the stanza would be used. If a `.htrc`
file containing *child* stanzas is reloaded, any currently
running children are reused for *child* stanzas in the new
file with matching names (even if the *exec* line has
changed).

*fchild* 'NAME'::

Declares a named, transient request handler (see *ashd*(7) for
a more detailed description of transient handlers). It must
contain exactly one follow-up line, *exec* 'PROGRAM'
['ARGS'...], specifying the program to execute and the
arguments to pass it. In addition to the specified arguments,
the HTTP method, raw URL and the rest string will be appended
as described in *ashd*(7). If given in a `.htrc` file, the
program will be started in the same directory as the `.htrc`
file itself.

*match* ['TYPE']::

Specifies a filename pattern-matching rule. The
pattern-matching procedure and the follow-up lines accepted by
this stanza are described below, under MATCHING.

*capture* 'HANDLER' ['FLAGS']::

Only meaningful in `.htrc` files. If a *capture* directive is
specified, then the URL-to-file mapping procedure as described
above is aborted as soon as the directory containing the
`.htrc` file is encountered. The request is passed, with any
remaining rest string, to the specified 'HANDLER', which must
be a named request handler specified either in the same
`.htrc` file or elsewhere. The *capture* directive accepts no
follow-up lines. Note that the `X-Ash-File` header is not
added to requests passed via *capture* directives. Normally,
*capture* directives will be ignored if they appear in the
root directory that *dirplex* serves, but not if 'FLAGS'
contain the character `D`.
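
The directives above might be combined as in the following
sketch. The `conf.d` directory, the `app` handler name and the
arguments given to *callscgi* are hypothetical; only the directive
syntax is taken from the descriptions above.

--------
# In a configuration file named by -c: pull in every file matching a
# glob pattern, relative to the directory containing this file.
include conf.d/*.rc

# "index" contains no dots, so it also matches a file such as
# index.php when no file named exactly "index" exists.
index-file index.html index

# In a .htrc file in the root directory being served: hand requests
# to one persistent handler. The D flag keeps the capture from being
# ignored in the root directory.
child app
  exec callscgi scgi-wsgi -p . app

capture app D
--------

A `.htrc` file closer to the files concerned could turn index file
searching off again for its own subtree with an *index-file* line
naming no files.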

MATCHING
--------

When a file or directory has been found by the mapping procedure (see
URL-TO-FILE MAPPING above), the name of the physical file is examined
to determine a request handler to pass the request to. Note that only
the physical file name is ever considered; any logical request
parameters such as the request URL or the rest string are entirely
ignored.

To match a file, any *match* stanzas specified by any `.htrc` file or
in the global configuration files are searched in order of their
"distance" (see CONFIGURATION above) from the actual file. Which
*match* stanzas are considered depends on the type of the file being
matched: if an ordinary file is being matched, only *match* stanzas
without any 'TYPE' parameter are considered, while if it is a
directory, only those with the 'TYPE' parameter specified as
*directory* are considered. 'TYPE' can also take the value *notfound*,
described below under 404 RESPONSES.
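
The effect of distance can be sketched with two stanzas matching the
same files. The charset parameter is only an illustration, and the
`send` handler is assumed to be declared as in EXAMPLES below. The
sketch also assumes that the first matching stanza found in distance
order is the one that is used.

--------
# In the global dirplex.rc (most distant):
match
  filename *.html
  xset content-type text/html
  handler send

# In a .htrc file in some subdirectory (closest to the files in and
# beneath that subdirectory):
match
  filename *.html
  xset content-type "text/html; charset=utf-8"
  handler send
--------

Under that assumption, HTML files in and beneath the subdirectory
would be passed on with the second content type, and all other HTML
files with the first.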

A *match* stanza must contain at least one follow-up line specifying
match rules. All rules must match for the stanza as a whole to match.
The following rules are recognized:

*filename* 'PATTERN'...::

Matches if the name of the file under consideration matches
any of the 'PATTERNs'. A 'PATTERN' is an ordinary glob
pattern, such as `*.php`. See *fnmatch*(3) for more
information.

*pathname* 'PATTERN'...::

Matches if the entire path of the file under consideration
matches any of the 'PATTERNs'. A 'PATTERN' is an ordinary glob
pattern, except that slashes are not matched by wildcards. See
*fnmatch*(3) for more information. If a *pathname* rule is
specified in a `.htrc` file, the path will be examined as
relative to the directory containing the `.htrc` file, rather
than to the root directory being served.

*default*::

Matches if and only if no *match* stanza without a *default*
rule matches (in any configuration file).

*local*::

Valid only in `.htrc` files, *local* matches if and only if
the file under consideration resides in the same directory as
the containing `.htrc` file.
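
A sketch of how the rules combine, as it might appear in a `.htrc`
file. The `css` directory name is hypothetical, the `send` handler is
assumed to be declared as in EXAMPLES below, and the *pathname*
pattern assumes that relative paths are written without a leading
slash.

--------
# Both rules must match: only *.txt files directly in the directory
# containing this .htrc file, not those in subdirectories.
match
  local
  filename *.txt
  xset content-type text/plain
  handler send

# Wildcards do not match slashes, so this covers css/site.css but
# not css/themes/dark.css.
match
  pathname css/*.css
  xset content-type text/css
  handler send
--------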

In addition to the rules, a *match* stanza must contain exactly one
follow-up line specifying the action to take if it matches. The
following actions are recognized:

*handler* 'HANDLER'::

'HANDLER' must be a named handler (see CONFIGURATION
above). The named handler is searched for not only in the same
configuration file as the *match* stanza, but in all
configuration files that are valid for the file under
consideration, in order of distance. As such, a more deeply
nested `.htrc` file may override the specified handler without
having to specify any new *match* stanzas.

*fork* 'PROGRAM' ['ARGS'...]::

Run a transient handler for this file, as if it were specified
by a *fchild* stanza. This action exists mostly for
convenience.
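
Because handler names are looked up in order of distance, a nested
`.htrc` file can replace a handler while reusing the *match* stanza
of a more distant file. A minimal sketch, reusing the commands from
EXAMPLES below:

--------
# In the global dirplex.rc:
fchild php
  exec callcgi -p php-cgi

match
  filename *.php
  handler php

# In a .htrc file further down the tree. No new match stanza is
# needed; for files in and beneath this directory, the handler
# lookup finds this definition of php first.
child php
  exec callfcgi multifscgi 5 php-cgi
--------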

A *match* stanza may also contain any number of the following,
optional directives:

*set* 'HEADER' 'VALUE'::

If the *match* stanza is selected as the match for a file, the
named HTTP 'HEADER' in the request is set to 'VALUE' before
passing the request on to the specified handler.

*xset* 'HEADER' 'VALUE'::

*xset* does exactly the same thing as *set*, except that
'HEADER' is automatically prepended with the `X-Ash-`
prefix. The intention is only to make configuration files
look nicer in this very common case.
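
For example, the two stanzas in the following sketch set the same
request header. The `send` handler is assumed to be declared as in
EXAMPLES below.

--------
match
  filename *.html
  xset content-type text/html
  handler send

match
  filename *.htm
  set X-Ash-content-type text/html
  handler send
--------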

404 RESPONSES
-------------

An HTTP 404 response is sent to the client if

* The mapping procedure fails to find a matching physical file;
* A path element is encountered during mapping which, after URL
  unescaping, either begins with a dot or contains slashes;
* The mapping procedure finds a file which is neither a directory nor
  a regular file (or a symbolic link to any of the same);
* An empty, non-final path element is encountered during mapping; or
* The mapping procedure results in a file which is not matched by any
  *match* stanza.

By default, *dirplex* will send a built-in 404 response, but there are
two ways to customize the response:

First, *match* stanzas with the type *notfound* will be matched
against any request that would result in a 404 error. The filename for
such matching is that of the last successfully found component, which
may be a directory, for example in case a name component could not be
found in the real filesystem; or a file, for example in case a file
was found, but not matched by any *match* stanzas.

Otherwise, any request that would result in a 404 response but is
matched by no *notfound* stanza is instead passed to a default handler
named `.notfound`, which is handled internally in *dirplex* by
default, but may be overridden just as any other handler may be in a
`.htrc` file or by global configuration. Note, however, that any
request not matched by a *notfound* stanza will not have the
`X-Ash-File` header added to it.

The built-in `.notfound` handler can also be used in *match* or
*capture* stanzas (for example, to restrict access to certain files or
directories).
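
Both mechanisms might be used as in the following sketch. The
`errpage` handler name and the `./errpage` program are hypothetical,
and using `filename *` to match every name in a *notfound* stanza is
an assumption based on the rules described under MATCHING above.

--------
# Answer with a 404 for backup files, even though they exist.
match
  filename *~ *.bak
  handler .notfound

# Pass requests that would otherwise get the built-in 404 response
# to a custom handler instead.
fchild errpage
  exec ./errpage

match notfound
  filename *
  handler errpage
--------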

EXAMPLES
--------

The *sendfile*(1) program can be used to serve HTML files as follows.

--------
fchild send
  exec sendfile

match
  filename *.html *.htm
  xset content-type text/html
  handler send
--------

Assuming the PHP CGI interpreter is installed on the system, PHP
scripts can be used with the following configuration, using the
*callcgi*(1) program.

--------
# To use plain CGI, which uses more resources per handled request,
# but fewer static resources:
fchild php
  exec callcgi -p php-cgi

# To use FastCGI, which keeps PHP running at all times, but uses fewer
# resources per handled request:
child php
  exec callfcgi multifscgi 5 php-cgi

match
  filename *.php
  handler php
--------

If there is a directory without an index file, a file listing can be
automatically generated by the *htls*(1) program as follows.

--------
match directory
  default
  fork htls
--------

The following configuration can be placed in a `.htrc` file in order
to dedicate the directory containing that file to some external SCGI
script engine. Note that *callscgi*, and therefore the script engine
itself, is started in the same directory, so that arbitrary code
modules or data files can be put directly in that directory and be
easily found.

--------
child foo
  exec callscgi scgi-wsgi -p . foo

capture foo
--------

AUTHOR
------
Fredrik Tolf <fredrik@dolda2000.com>

SEE ALSO
--------
*ashd*(7)