libs-base/Tools/HTMLLinker.gsdoc
Adam Fedor 7dd8a09a24 Readded file
git-svn-id: svn+ssh://svn.gna.org/svn/gnustep/libs/base/trunk@25822 72102866-910b-0410-8b05-ffd578937521
2008-01-01 21:55:55 +00:00


<chapter>
<heading>The HTMLLinker tool</heading>
<section>
<heading>Introduction</heading>
<p>
The GNUstep HTML linker is able to fixup links from one HTML
document to other HTML documents. By link we mean the standard
<code>&lt;a href="NSString.html#DescriptionOfNSString"&gt;</code>
tag. By fixing up a link we mean modifying the path in the
<code>href</code> so that it points to the actual file on disk.
For example, if the <code>DescriptionOfNSString</code>
location is in the file <code>NSStringOverview.html</code> in the
directory <code>/home/nicola/Doc</code>, when the linker fixes up
the <code>&lt;a
href="NSString.html#DescriptionOfNSString"&gt;</code> link, it
will replace it with <code>&lt;a
href="/home/nicola/Doc/NSStringOverview.html#DescriptionOfNSString"&gt;</code>.
Please note that when fixing up the link, the linker modifies both
the path and the file name that the link points to, but not the
location inside the file (the <code>DescriptionOfNSString</code>
in the example).
</p>
</section>
<section>
<heading>Practical Usage of the linker</heading>
The linker is typically used to maintain
cross-references in software documentation. You need to establish
some sort of convention used by all your software documentation
for the link names. For example, suppose that your documentation
is about C libraries. For each C function, you might decide to
tag its documentation in the files with the name
<code>function$function_name</code>. For example, the place in
the documentation that describes the <code>start_library()</code>
function would carry the HTML tag <code>&lt;a
name="function$start_library"&gt;</code>. Having established this
convention, in any HTML file in your documentation in which you
want to create a link to the documentation for the
<code>start_library()</code> function, you use the code
<code>&lt;a rel="dynamic"
href="#function$start_library"&gt;</code> (note that you can
ignore the problem of locating the actual file which contains the
documentation for the <code>start_library()</code> function; that
is precisely what the linker will do for you). Whenever you
install the documentation for a new project, you first create a
relocation file for the project documentation, by running
<pre>
HTMLLinker -BuildRelocationFileForDir Documentation
</pre>
if for example the project documentation is in the
<code>Documentation</code> subdirectory. This will create a
<code>Documentation/table.htmlink</code> file, which contains a
list of all names found in the project documentation, and for each
of them, the file in which it's found. Then, you install the
project documentation (say for example that it's installed into
<code>/opt/gnustep/Local/Documentation/MyProject</code>), and once
it's installed, you can run the linker to update all links so that
they point to the actual files:
<pre>
HTMLLinker /opt/gnustep/Local/Documentation/MyProject \
-l /opt/gnustep/Local/Documentation/MyProject \
-l /opt/gnustep/Local/Documentation/MyOtherProject
</pre>
This will fixup all links in <code>MyProject</code>'s HTML files
by using the relocation files of both <code>MyProject</code> and
<code>MyOtherProject</code>, so all links to anything which is
documented inside those files will be generated correctly.
</section>
<section>
<heading>Usage of the tool with autogsdoc</heading>
You can use the tool with documentation generated by autogsdoc to
perform the linking (or to relink it). Make sure to use the option
<code>-LinksMarker gsdoc</code> because autogsdoc marks the links
to be fixed up by the linker by using <code>rel="gsdoc"</code>.
</section>
<section>
<heading>Modes of operation</heading>
The HTML linker works in two phases:
<ul>
<li> The first (called <i>generation of the relocation
table</i>) preprocesses a given set of HTML files so that they can
be the destination of links. It builds a relocation table for
the given set of HTML files. This relocation table simply maps
all names (as in <code>&lt;a name="xxx"&gt;</code>) in the files
to the file in which the name is found. The HTML files are not
touched. The linker is able to merge this dynamically generated
relocation table with pregenerated relocation tables loaded from
files (called <i>relocation files</i>).
</li>
<li> The second (called <i>linking</i>) links a given file to
the available HTML files on disk, by using the relocation table
to modify the HTML links in the file so that they point to
existing files.
</li>
</ul>
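The first phase can be pictured with a minimal Python sketch. It is
an illustration of the idea only, not the tool's actual
implementation: the names <code>NameCollector</code> and
<code>build_relocation_table</code>, and the choice to pass the
files in as a dictionary of path to HTML text, are assumptions made
for the example. It also assumes that the first file seen for a
name is the one kept.

```python
from html.parser import HTMLParser


class NameCollector(HTMLParser):
    """Collects the value of every name attribute found on an anchor tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for key, value in attrs:
            if key == "name" and value:
                self.names.append(value)


def build_relocation_table(pages):
    """Map each anchor name to the path of the file that defines it.

    pages: dict of file path to HTML text (hypothetical input shape).
    The HTML text is only read, never modified.
    """
    table = {}
    for path, html in pages.items():
        collector = NameCollector()
        collector.feed(html)
        collector.close()
        for name in collector.names:
            # Assumption: the first file seen for a name is kept.
            table.setdefault(name, path)
    return table
```

The resulting dictionary plays the role of the relocation table:
looking up a name gives the file in which that name is found.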
The HTML linker can also be run in a special mode, to generate a
relocation file for later reuse. In this mode, the HTML linker
will build the relocation table for all files in a directory, then
save the relocation table into a <code>table.htmlink</code> file
in that directory for later reuse.
There are three kinds of files:
<ul>
<li>
<em>input files</em>: these are HTML files which are modified
as a consequence of linking; they have their links fixed up.
</li>
<li>
<em>destination files</em>: these are HTML files which are
read to produce relocation tables.
</li>
<li>
<em>relocation files</em>: these files are not HTML files -
they are only created and read by the linker (unless you have
a tool which can manage them), and are in a specific - very
simple - format. They are used to save relocation information
for later reuse, so that the linker can run faster. Normally,
they have a <code>.htmlink</code> extension.
</li>
</ul>
</section>
<section>
<heading>Linker behaviour</heading>
The linker keeps a main relocation table, which is empty at the
beginning. When run, the linker performs the following steps:
<ol>
<li>
the linker reads and parses all relocation files specified on
the command line, and merges the relocation tables found there
into the main relocation table.
</li>
<li>
the linker reads and parses all destination files specified on
the command line, and builds a relocation table for them,
merging it into the main relocation table.
</li>
<li>
if any input files are specified on the command line, the
linker links the files using the relocation table.
</li>
</ol>
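The order of the three steps can be sketched in Python as follows.
This is a rough model under stated assumptions, not the linker's
real code: relocation tables are represented as plain dictionaries
from name to file, <code>merge_into</code> and
<code>run_linker</code> are hypothetical names, and keeping the
first entry when a name appears twice is an assumption (the
behaviour for duplicate names is not specified here).

```python
def merge_into(main_table, other_table):
    """Merge other_table into main_table, keeping entries already present."""
    for name, path in other_table.items():
        # Assumption: an existing entry is not overwritten.
        main_table.setdefault(name, path)


def run_linker(relocation_tables, destination_tables, input_files, link):
    """Follow the three steps in order and return the linking results.

    link: callable taking (input_file, table); it stands in for the
    linking phase and is supplied by the caller.
    """
    main_table = {}                      # empty at the beginning
    for table in relocation_tables:      # step 1: relocation files
        merge_into(main_table, table)
    for table in destination_tables:     # step 2: destination files
        merge_into(main_table, table)
    # Step 3: linking happens only if input files were given.
    return [link(path, main_table) for path in input_files]
```

With no input files the function still builds the main relocation
table but links nothing, which mirrors the conditional third step.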
</section>
<heading>Specifying input, destination and relocation files</heading>
All command line arguments which do not begin with a hyphen
(<code>-</code>), and which are not the values of defaults (for
example, not the <code>YES</code> in <code>-Warn YES</code>,
because that is the value of the default <code>-Warn</code>), are
interpreted as input files. Each destination file is specified by
using a <code>-d</code> option, and each relocation file by using
a <code>-l</code> option. If a directory is specified as an input
(or destination) file, the linker will recurse into the directory
and add to the list of input (or destination) files all files in
the directory (and in the directory's subdirectories, no matter
how deeply nested) which have one of the following extensions:
<code>.html</code>, <code>.HTML</code>, <code>.htm</code> or
<code>.HTM</code>. If a directory is specified as a relocation
file, the linker will add to the list of relocation files all
files in the directory which have the extension
<code>.htmlink</code>. A typical invocation of the linker is as
follows:
<pre>
HTMLLinker -BuildRelocationFileForDir Doc
</pre>
Builds a relocation file for the documentation in the
directory <code>Doc</code>. After this has been done, the
directory <code>Doc</code> can be used as a <code>-l</code>
argument.
<pre>
HTMLLinker test.html -l Doc
</pre>
Links the file <code>test.html</code> using the relocation file
just generated in the <code>Doc</code> directory.
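The way a directory argument expands into a list of HTML files can
be sketched in Python. The function name
<code>collect_html_files</code> is hypothetical and the sketch only
models the rule described above (recursion at any depth, exactly
the four listed extensions); sorting the result is a choice made
for the example, not something the linker is known to do.

```python
import os

# The four extensions listed above; other spellings are not matched.
HTML_EXTENSIONS = (".html", ".HTML", ".htm", ".HTM")


def collect_html_files(path):
    """Return the files named by path, recursing if it is a directory."""
    if not os.path.isdir(path):
        # A plain file argument is taken as it is.
        return [path]
    found = []
    for root, _dirs, files in os.walk(path):
        for filename in files:
            if os.path.splitext(filename)[1] in HTML_EXTENSIONS:
                found.append(os.path.join(root, filename))
    return sorted(found)
```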
<heading>What is a link</heading>
A link is an anchor tag with an <code>href</code>, such as
<code>&lt;a href="dest.html#location"&gt;</code>. The destination
file of the link is the file specified in the <code>href</code>;
<code>dest.html</code> in the example. The destination file is
ignored by the linker; the name of the link (which is everything which
follows the <code>#</code>) is used to perform the linking.
<heading>Which links are fixed up</heading>
Normally, the linker will only fixup links which have the
<code>rel</code> attribute set to <code>dynamic</code>, as in the
following example: <code>&lt;a href="nicola.html"
rel="dynamic"&gt;</code>. In this way, you can specify in your
HTML document which links you want to be fixed up, and which you
don't want to be. You can change the type of links to be fixed up
by using the <code>-LinksMarker</code> option, as in
<code>-LinksMarker gsdoc</code>, which causes the linker to fixup
all links with the <code>rel</code> attribute set to
<code>gsdoc</code> rather than <code>dynamic</code>. In certain
situations you might want to force the linker to attempt to fixup
all links; you can run the linker with the <code>-FixupAllLinks
YES</code> option to cause this behaviour. As a special
exception, links which obviously are not to be fixed up, such as
links beginning with <i>mailto:</i> or <i>news:</i>, or links
without a name, are never fixed up.
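The selection rule can be summarised as a small Python predicate.
This is a sketch under assumptions: the function name
<code>should_fix_up</code> and its parameters are invented for the
example, and only the two excluded prefixes named above are
checked, although the linker may exclude others as well.

```python
def should_fix_up(href, rel=None, links_marker="dynamic", fixup_all_links=False):
    """Decide whether a link is a candidate for fixing up."""
    # Never fixed up: mailto: and news: links.
    if href.startswith(("mailto:", "news:")):
        return False
    # Never fixed up: links without a name after the hash sign.
    _file_part, separator, name = href.partition("#")
    if not separator or not name:
        return False
    # FixupAllLinks overrides the rel attribute check.
    if fixup_all_links:
        return True
    # Otherwise the rel attribute must equal the marker.
    return rel == links_marker
```

For instance, <code>should_fix_up("a.html#x", rel="gsdoc",
links_marker="gsdoc")</code> is true, while the same link with
<code>rel="dynamic"</code> is skipped under that marker.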
<heading>How links are fixed up</heading>
When the HTML linker encounters a link which needs to be fixed up
(say <code>&lt;a href="dest.html#location"&gt;</code>), it
searches the relocation table for a destination file which
contains the <code>location</code> name. If no such file is
found, the HTML linker emits a warning, and replaces the link in
the file with a link to the destination without the filename. In
the example, it would simply emit <code>&lt;a
href="#location"&gt;</code>. If the destination file is found in
the list, instead, the HTML linker replaces the link with the full
path to the destination file on disk. For example, if, according
to the relocation table, the file
<code>/home/nicola/Doc/dest.html</code> contains the name
<code>location</code>, the HTML linker will fixup the link to be
<code>&lt;a href="/home/nicola/Doc/dest.html#location"&gt;</code>
(as a special exception, if there is a path mapping which matches
the path to the destination file, it's applied to the path in the
link. See below for a detailed explanation of path mappings).
It's important to notice that you must have unique link names for
the linker to work properly. For example, if you have two
different destination files containing the same name, say
<code>NSObject.html</code> and <code>NSString.html</code> both
containing the name <code>init</code>, then the linker can't
resolve <code>&lt;a href="#init"&gt;</code>, because it has no way
to know if you meant the link to point to the first or the second
destination file! You should choose names that uniquely
identify what they refer to, for example
<code>NSObject_i_init</code> and <code>NSString_i_init</code> if
the first name marks the place documenting the <code>-init</code>
method of the NSObject class and the second marks that of the
NSString class. Then all links will clearly refer to one place or
the other one, and no confusion will arise. If there are multiple
destination files for a link, the linker will guess which one is
the right one, and that might not give the desired result.
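The rewriting of a single <code>href</code> can be sketched in
Python as follows. The function name <code>fix_up_href</code> and
the representation of the relocation table as a dictionary from
name to file path are assumptions made for the example; path
mappings are left out of this sketch.

```python
import warnings


def fix_up_href(href, table):
    """Return href rewritten to point at the file that defines its name.

    table: dict of name to path on disk (a stand-in for the
    relocation table).
    """
    # The file part of the original href is ignored; only the name counts.
    _ignored_file, _separator, name = href.partition("#")
    destination = table.get(name)
    if destination is None:
        # Unknown name: warn and drop the filename from the link.
        warnings.warn("no destination file found for name " + repr(name))
        return "#" + name
    return destination + "#" + name
```

Because the table holds one file per name, two destination files
defining the same name cannot both be represented, which is why
unique names matter.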
<heading>How links are checked</heading>
When a link is fixed up, the linker implicitly checks that the link
is correct, because if the link name can't be found in the relocation
tables, a warning is issued.
<heading>Path mappings</heading>
Path mappings are an additional feature of the HTML linker which
can be used when exporting documentation to be served by a web
server. If you are not putting your documentation on a web server
but simply reading it from the filesystem, then you don't need the
path mappings. The issue with exporting documentation to a web
server is that you refer to files using paths which are not
necessarily the same paths where the files are on disk. For
example, suppose that you have some HTML documentation in
<code>/opt/doc/base</code> and some other HTML documentation in
<code>/opt/doc/gui</code>. The HTML files in the two
documentation directories refer to each other. You can run the
HTML linker to fixup all links, and everything works. But now
suppose that you set up a web server; the web server, for example,
will serve URLs beginning with <code>/Base</code> (that is,
requests from a browser of the form
<code>http://www.server.org/Base</code>) by taking files from
<code>/opt/doc/base</code>, and URLs beginning with
<code>/Gui</code> by taking files from <code>/opt/doc/gui</code>.
To fixup the links in this case, you need path mappings. A path
mapping specifies that a certain directory on disk is to be
referred to in some different way in links. In the example, you
would pass
<pre>
-PathMappings '{ "/opt/doc/base"="/Base"; "/opt/doc/gui"="/Gui"; }'
</pre>
to the linker.
Each path mapping maps a <em>path on disk</em> to a <em>virtual
path</em>. For example, it maps the path on disk
<code>/opt/doc/base</code> to the virtual path <code>/Base</code>.
Each time the linker fixes up a link, after finding the
destination file, it checks the list of path mappings. If the
path to the destination file begins with the <em>path on disk</em>
of one of the path mappings, then that <em>path on disk</em> is
replaced with the corresponding <em>virtual path</em> in the path
to the destination file before the path to the destination file is
written out in the link.
For example, if you have the path mapping explained above, and if
the linker is fixing up the link <code>&lt;a
href="hi.html#nicola"&gt;</code>, where the destination file is
<code>/opt/doc/base/nicola/hi.html</code>, then the destination
path matches the path mapping for <code>/opt/doc/base</code>, so
the path mapping is applied and the link is fixed up to be
<code>&lt;a href="/Base/nicola/hi.html#nicola"&gt;</code> rather than
<code>&lt;a href="/opt/doc/base/nicola/hi.html#nicola"&gt;</code> as it
would normally have been without the path mapping.
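The prefix replacement can be sketched in a few lines of Python.
The function name <code>apply_path_mappings</code> is hypothetical,
the mappings are modelled as a dictionary from path on disk to
virtual path, and using the first matching mapping when several
match is an assumption.

```python
def apply_path_mappings(path, mappings):
    """Replace a leading path on disk with its virtual path, if one matches."""
    for disk_path, virtual_path in mappings.items():
        # A mapping matches when the destination path begins with
        # its path on disk.
        if path.startswith(disk_path):
            return virtual_path + path[len(disk_path):]
    # No mapping matches: the path is written out unchanged.
    return path
```

With the mappings from the example,
<code>/opt/doc/base/nicola/hi.html</code> becomes
<code>/Base/nicola/hi.html</code>, and a path outside both
directories is returned unchanged.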
<heading>Specifying path mappings</heading>
<h5>On the command line</h5>
Each path mapping specifies a mapping of a path on disk to a web
server alias. The first way to specify the mappings is on the
command line, in the form of a dictionary argument to the
<code>-PathMappings</code> option, as in
<pre>
-PathMappings '{ "/opt/doc/base"="/Base"; "/opt/doc/gui"="/Gui"; }'
</pre>
where <code>/opt/doc/base</code> and <code>/opt/doc/gui</code> are
the paths on disk and <code>/Base</code> and <code>/Gui</code> are
the corresponding web server URL paths.
<h5>In a path mappings file</h5>
The other way to specify mappings is to write them into a file,
in the format of a dictionary, as, for example, in a file containing
the following lines
<pre>
{
"/opt/doc/base"="/Base";
"/opt/doc/gui"="/Gui";
}
</pre>
and then tell the linker to read the path mappings from that file,
by giving the filename as the argument to the
<code>-PathMappingsFile</code> option. For example, if the file
containing the mappings is called <code>mappings</code>, then you need
to pass
<pre>
-PathMappingsFile mappings
</pre>
to the linker to have it read mappings from the file.
<h5>Command line path mappings override file path mappings</h5>
Both command line path mappings and path mappings from a file can
be used at the same time; in case of conflict, command line path
mappings override path mappings from the file.
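The precedence rule amounts to a dictionary merge in which the
command line entries are applied last. A minimal Python sketch,
with the hypothetical name <code>combine_path_mappings</code> and
both sources modelled as dictionaries:

```python
def combine_path_mappings(file_mappings, command_line_mappings):
    """Merge both sources; command line entries win on conflict."""
    combined = dict(file_mappings)           # start from the file
    combined.update(command_line_mappings)   # command line overrides
    return combined
```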
<h3>Summary of all the options</h3>
Each of the options beginning with a single hyphen (<code>-</code>)
requires an argument, as in
<pre>
HTMLLinker Documentation -LinksMarker gsdoc -d Documentation
</pre>
which sets <code>LinksMarker</code> to <code>gsdoc</code>.
Options may appear anywhere on the command line. Options which
begin with a double hyphen (such as <code>--help</code>) do not
require an argument, as in
<pre>
HTMLLinker --help
</pre>
<h4>-d</h4>
Followed by a destination HTML file, or a directory containing
destination HTML files.
<h4>-l</h4>
Followed by a relocation file, or a directory containing relocation files.
<h4>-FixupAllLinks</h4>
If set to <code>NO</code> (the default) only links containing the
<code>rel</code> attribute set to <code>dynamic</code> (or
whatever is specified as <code>LinksMarker</code>) are fixed up in
the input files. If set to <code>YES</code>, all links are fixed
up.
<h4>-LinksMarker</h4>
If set (and if <code>FixupAllLinks</code> is <code>NO</code>),
only links with the <code>rel</code> attribute set to its value
are processed. By default it is set to <code>dynamic</code>.
<h4>-PathMappings</h4>
If set to a dictionary, read the dictionary as path mappings. See
above for more details of path mappings.
<h4>-PathMappingsFile</h4>
If set to a string, consider it to be the name of a file; read
path mappings from that file. The file must contain the path
mappings in the form of a dictionary. See above for more details
on path mappings.
<h4>-Verbose</h4>
If set to <code>YES</code>, prints more messages than if set
to <code>NO</code> (the default).
<h4>--help</h4>
Prints a quick explanation of the command line syntax and exits.
<h4>--version</h4>
Prints the version and exits.
<hr>
</chapter>