mirror of
https://github.com/boostorg/auto_index.git
synced 2025-05-09 23:24:02 +00:00
Add prefix= command line option, and updated docs to match. Changed examples to use prefix= option so that files to scan are found relative to boost-root. [SVN r50994]
[article AutoIndex
    [quickbook 1.4]
    [copyright 2008 John Maddock]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        [@http://www.boost.org/LICENSE_1_0.txt])
    ]
    [authors [Maddock, John]]
    [/last-revision $Date: 2008-11-04 17:11:53 +0000 (Tue, 04 Nov 2008) $]
]

[section:overview Overview]

AutoIndex is a tool for taking the grunt work out of indexing a
Quickbook\/Boostbook\/Docbook document that describes C\/C++ code.

Traditionally, in order to index a Docbook document you would
have to manually add a large amount of `<indexterm>` markup:
in fact one `<indexterm>` for each occurrence of each term to be
indexed.

Instead AutoIndex will scan one or more C\/C++ header files
and extract all the ['function], ['class], ['macro] and ['typedef]
names that are defined by those headers, and then insert the
`<indexterm>` elements into the XML document for you.

AutoIndex creates index entries as follows: for each occurrence of
each search term, it creates two index entries. One has the search term
as the primary index key and the title of the section it appears in as
a subterm; the other has the section title as the main index entry and the
search term as the subentry.  Thus the user has two chances to find what they're
looking for, based upon either the section name or the ['function], ['class], ['macro]
or ['typedef] name.

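As a rough sketch of what this means in Docbook terms, one occurrence of a hypothetical class `myclass` in a section titled "Tutorial" corresponds to a pair of entries along these lines. The term and section title are invented for illustration, and the exact markup AutoIndex emits may differ in detail:

[pre
<indexterm>
  <primary>myclass</primary>
  <secondary>Tutorial</secondary>
</indexterm>
<indexterm>
  <primary>Tutorial</primary>
  <secondary>myclass</secondary>
</indexterm>
]
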
So for example in Boost.Math the class name `students_t_distribution` has a primary
entry that lists all sections it appears in:

[$../students_t_eg_1.png]

Then those sections also have primary entries, which list all the search terms those
sections contain:

[$../students_t_eg_2.png]

Of course these automated index entries may not be quite
what you're looking for: often you'll get a few spurious entries, a few missing entries,
and a few entries where the section name used as an index entry is less than ideal.
So AutoIndex provides some powerful regular expression based rules that allow you
to add, remove, constrain, or rewrite entries.  Normally just a few lines in
AutoIndex's script file are enough to tailor the output to match the author's
expectations.

AutoIndex also supports multiple indexes (as does Docbook), and since it knows
which search terms are ['function], ['class], ['macro] or ['typedef] names, it
can add the necessary attributes to the XML so that you can have separate
indexes for each of these different types.  These specialised indexes only contain
entries for the ['function], ['class], ['macro] or ['typedef] names: ['section
names] are never used as primary index terms here, unlike the main "include everything"
index.

Finally, while the Docbook XSL stylesheets create nice indexes complete with page
numbers for PDF output, the HTML indexes are less satisfactory, as these use
section titles in place of page numbers.  Since AutoIndex already uses section titles
as index entries, this leads to a lot of repetition, so as an alternative AutoIndex
can be instructed to construct the index itself.  This is faster than using
the XSL stylesheets, and each index entry is then a hyperlink to the
appropriate section:

[$../students_t_eg_3.png]

With internal index generation there is also a helpful navigation bar
at the start of each Index:

[$../students_t_eg_4.png]

[endsect]

[section:tut Getting Started and Tutorial]

[h4 Step 1: Build the tool]

[note This step is strictly optional, but can speed up build times.]

Change directory into `tools/auto_index/build` and invoke bjam as:

   bjam release

Optionally pass the name of the compiler toolset you want to use to bjam as well:

   bjam release gcc

Now open up your user-config.jam file and at the end add the line:

[pre
using auto-index : ['full-path-of-executable] ;
]

[note
This declaration must go towards the end of user-config.jam, or in any case after the Boostbook initialisation.

Also note that Windows users must use forward slashes in the paths in user-config.jam.]

Finally copy `tools/auto_index/auto-index.jam` into the same directory as the rest of the Boost.Build tools
(under `tools/build/v2/tools` in your main Boost tree): this is a temporary requirement that will go away
if the tool is accepted into Boost.

[h4 Step 2: Configure Boost.Build]

Assuming you have a Jamfile for building your documentation that looks
something like:

[pre
boostbook standalone
   :
      type_traits
   :
      # build requirements go here:
   ;
]

Then add the line:

[pre using auto-index ; ]

to the start of the Jamfile, and then add whatever auto-index options
you want to the build requirements section, for example:

[pre
boostbook standalone
   :
      type_traits
   :
      # build requirements go here:

      # this one turns on indexing:
      <auto-index>on
      # choose indexing method for pdf:
      <format>pdf:<auto-index-internal>off
      # choose indexing method for html:
      <format>html:<auto-index-internal>on
      # set the name of the script file to use:
      <auto-index-script>index.idx
   ;
]

The available options are:

[variablelist
[[<auto-index>off/on][Turns indexing of the document on.  Defaults to
"off", so be sure to set this if you want AutoIndex invoked!]]
[[<auto-index-internal>off/on][Chooses whether AutoIndex creates the index
itself (feature on), or whether it simply inserts the necessary DocBook
markup so that the DocBook XSL stylesheets can create the index.]]
[[<auto-index-script>filename][Specifies the name of the script to load.]]
[[<auto-index-no-duplicates>off/on][When "on" AutoIndex will only index a term
once in any given section, otherwise (the default) multiple index entries per
term may be created if the term occurs more than once in the section.]]
[[<auto-index-verbose>off/on][Defaults to "off".  When turned on, AutoIndex
prints progress information, which is generally useful only for debugging purposes.]]
[[<auto-index-prefix>filename][Specifies a directory to apply as a prefix to all relative file paths in the script file.]]
]

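As a sketch of how the remaining options might be combined with those shown earlier, the requirements below also suppress duplicate entries, turn on progress output, and set a path prefix. The prefix value `../../..` is purely illustrative (an assumption about where the files to scan live), and the `type_traits` target is simply carried over from the example above:

[pre
boostbook standalone
   :
      type_traits
   :
      <auto-index>on
      <auto-index-script>index.idx
      # index each term at most once per section:
      <auto-index-no-duplicates>on
      # print progress information while debugging the script:
      <auto-index-verbose>on
      # prefix applied to relative paths in the script file (illustrative value):
      <auto-index-prefix>../../..
   ;
]
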
[h4 Step 3: Add indexes to your documentation]

To add a single index to a BoostBook\/Docbook document, add
`<index/>` at the location where you want the index to appear.  The
index will be rendered as a separate section when the documentation
is built.

To add multiple indexes, give each one a title and set its
`type` attribute to specify which terms will be included.  For example,
to place the ['function], ['class], ['macro] or ['typedef] names
indexed by ['auto_index] in separate indexes along with a main
"include everything" index as well, one could add:

[pre
<index type\="class_name">
<title>Class Index</title>
</index>

<index type\="typedef_name">
<title>Typedef Index</title>
</index>

<index type\="function_name">
<title>Function Index</title>
</index>

<index type\="macro_name">
<title>Macro Index</title>
</index>

<index/>
]

In quickbook, you add the same markup but enclose it in an escape:

   '''<index/>'''

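The typed indexes can be escaped in the same way.  As a minimal sketch (the title text is whatever you choose), a class index written directly in Quickbook source might look like:

   '''
   <index type="class_name">
   <title>Class Index</title>
   </index>
   '''
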
[h4 Step 4: Create the script file]

AutoIndex works by reading a script file that tells it what to index.
At its simplest it will scan one or more headers for terms that
should be indexed in the documentation.  So for example to scan
"myheader.hpp" the script file would just contain:

   !scan myheader.hpp

Or we can recursively scan through directories looking for all
the files to scan whose name matches a particular regular expression:

[pre !scan-path "../../../../boost/math" ".*\.hpp" true ]

Note how each argument is whitespace separated and can be optionally
enclosed in "double quotes".  The final ['true] argument indicates
that subdirectories in `../../../../boost/math` should be searched
in addition to that directory.

Often the ['scan] or ['scan-path] rules will bring in too many terms
to search for, so we need to be able to exclude terms as well:

   !exclude type

which excludes the term "type" from being indexed.

We can also add terms manually:

   foobar

will index occurrences of "foobar" and:

   foobar \<\w*(foo|bar)\w*\>

will index any whole word containing either "foo" or "bar" within it.
This is useful when you want to index a lot of similar or related
words under one entry, for example:

   reflex

will only index occurrences of "reflex" as a whole word, but:

   reflex \<reflex\w*\>

will index occurrences of "reflex", "reflexing" and
"reflexed" all under the same entry ['reflex].

This inclusion rule can also restrict the term to
certain sections, and add an index category that
the term should belong to (so it only appears in certain
indexes).

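For instance, using the full form described in the Script File Reference, a line such as the following would index "myclass" only in sections whose ID begins with "mylib.examples", and place it in the "class_name" category.  The term and section IDs here are hypothetical:

   myclass "" "mylib.examples.*" class_name
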
Finally the script can add rewrite rules, which rename section names
that are automatically used as index entries.  For example we might
want to remove leading "A" or "The" prefixes from section titles
when AutoIndex uses them as an index entry:

   !rewrite-name "(?i)(?:A|The)\s+(.*)" "\1"

[h4 Step 5: Build Your Docs]

Make sure that auto-index.jam is in your BOOST_BUILD_PATH, by either
setting the environment variable BOOST_BUILD_PATH to point to the directory
containing it, or by copying the file into
`boost-root/tools/build/v2/tools`.  Then build the docs with either:

   bjam release

to build the html docs or:

   bjam pdf release

to build the pdf.

During the build process you should see AutoIndex emit a message
such as:

[pre Indexing 990 terms... ]

If you don't see that, or if it's indexing 0 terms then something is wrong!

[h4 Step 6: Iterate]

Creating a good index is an iterative process.  Often the first step is
just to add a header scanning rule to the script file and then generate
the documentation and see:

* What's missing.
* What's been included that shouldn't be.
* What's been included under a poor name.

Further rules can then be added to the script to handle these cases
and the next iteration examined, and so on.

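As an illustration of how a script might grow over a couple of iterations, a first pass could contain nothing but the scanning rule from Step 4, with later passes adding one rule for each problem found.  The particular terms below are examples only, not a recommendation:

   !scan myheader.hpp
   !exclude type
   reflex \<reflex\w*\>
   !rewrite-name "(?i)(?:A|The)\s+(.*)" "\1"

Here the `!exclude` line removes an unwanted term, the `reflex` line adds a missing one, and the `!rewrite-name` line tidies the section titles used as entries.
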
[endsect]

[section:script_ref Script File Reference]

The following elements can occur in a script:

[h4 Simple Inclusions]

   term [regular-expression1 [regular-expression2 [category]]]

[variablelist
[[term][The term to index: this will form a primary entry in the Index
with the section title(s) containing the term as secondary entries, and
also will be used as a secondary entry beneath each of the section
titles that the term occurs in.]]

[[regular-expression1][An optional regular expression: each occurrence
of the regular expression in the text of the document will result
in one index term being emitted.

If the regular expression is omitted or is "", then the ['term] itself
will be used as the search text, and only occurrences of whole words matching
['term] will be indexed.]]

[[regular-expression2][A constraint that specifies which sections are
indexed for ['term]: only if the ID of the section matches
['regular-expression2] exactly will that section be indexed for occurrences
of ['term].

For example:

`myclass "" "mylib.examples.*"`

will index occurrences of "myclass" as a whole word only in sections
whose ID begins with "mylib.examples", while:

`myclass "" "(?!mylib.introduction.*).*"`

will index occurrences of "myclass" in any section, except those whose
IDs begin with "mylib.introduction".]]

[[category][Optionally an index category to place occurrences of
['term] in.  If you have multiple indexes then this is the name
assigned to the index's "type" attribute.
]]

]

[h4 Source File Scanning]

   !scan source-file-name

Scans the C\/C++ source file ['source-file-name] for definitions of
['function], ['class], ['macro] or ['typedef] names and makes each of
these a term to be indexed.  Terms found are assigned to the index category
"function_name", "class_name", "macro_name" or "typedef_name" depending
on how they were seen in the source file.  These may then be included
in a specialised index whose "type" attribute has the same category name.

[h4 Directory and Source File Scanning]

   !scan-path directory-name file-name-regex [recurse]

[variablelist
[[directory-name][The directory to scan: this should be a path relative
to the script file and should use all forward slashes in its file name.]]

[[file-name-regex][A regular expression: any file in the directory whose name
matches the regular expression will be scanned for terms to index.]]

[[recurse][An optional Boolean value - either "true" or "false" - that
indicates whether to recurse into subdirectories.]]
]

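As a small sketch, the following pair of rules scans the headers directly inside one directory without descending into it, and scans a second directory together with all of its subdirectories.  Both directory names are hypothetical:

   !scan-path "include/mylib" ".*\.hpp" false
   !scan-path "include/mylib/detail" ".*\.hpp" true
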
[h4 Excluding Terms]

   !exclude term-list

Excludes all the terms in whitespace separated ['term-list] from being indexed.
This should be placed /after/ any ['!scan] or ['!scan-path] rules which may
result in the terms becoming included.

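For example, assuming that scanning a hypothetical header picks up several generic member names that would clutter the index, they can all be removed with one rule placed after the scan:

   !scan myheader.hpp
   !exclude type value_type result_type
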
[h4 Rewriting Section Names]

   !rewrite-id regular-expression new-name

[variablelist
[[regular-expression][A regular expression: all section IDs that match
the expression exactly will have index entries ['new-name] instead of
their title(s).]]

[[new-name][The name that the section will appear under in the index.]]
]

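As a sketch, with hypothetical section IDs, the following rule would list every section whose ID begins with "mylib.examples" under the single name "Examples" rather than under each section's own title:

   !rewrite-id "mylib.examples.*" "Examples"
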
   !rewrite-name regular-expression format-text

[variablelist
[[regular-expression][A regular expression: all sections whose titles
match the regular expression exactly will have index entries composed
of the regular expression match combined with the regex format string
['format-text].]]
[[format-text][The Perl-style format string used to reformat the title.]]
]

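As a further illustration, in the same style as the tutorial's rule, the sketch below captures everything before a trailing "Reference" and keeps only that capture.  The choice of rule is an example, not a recommendation:

   !rewrite-name "(.*)\s+Reference" "\1"

With this rule a section titled "Command Line Reference" would be indexed as "Command Line", while a title that does not end in "Reference" does not match the expression and so is not affected by it.
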
[endsect]

[section:comm_ref Command Line Reference]

The following command line options are supported by auto_index:

[variablelist
[[in=infilename][Specifies the name of the XML input file to be indexed.]]
[[out=outfilename][Specifies the name of the new XML file to create.]]
[[scan=source-filename][Specifies that ['source-filename] should be scanned
for terms to index.]]
[[script=script-filename][Specifies the name of the script file to process.]]
[[--no-duplicates][If a term occurs more than once in the same section, then
include only one index entry.]]
[[--internal-index][Specifies that auto_index should generate the actual
indexes rather than inserting `<indexterm>` elements and leaving index generation
to the XSL stylesheets.]]
[[prefix=pathname][Specifies a directory to apply as a prefix to all relative file paths in the script file.]]
]

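Putting these together, a direct invocation might look like the sketch below.  The file names and the prefix directory are placeholders, and the executable is assumed to be on the path under the name `auto_index`:

   auto_index in=mydoc.xml out=mydoc_indexed.xml script=index.idx prefix=../../.. --internal-index --no-duplicates
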
[endsect]