The GNUstep HTML Linker
Introduction
What the HTML linker does
The GNUstep HTML linker fixes up links from one HTML document to
other HTML documents. By link we mean the standard
<a href="NSString.html#DescriptionOfNSString"> tag. By fixing up a
link we mean modifying the path in the href so that it points to
the actual file on disk.
For example, if the DescriptionOfNSString location is in the file
NSStringOverview.html in the directory /home/nicola/Doc, when the
linker fixes up the
<a href="NSString.html#DescriptionOfNSString"> link, it will
replace it with
<a href="/home/nicola/Doc/NSStringOverview.html#DescriptionOfNSString">.
Please note that when fixing up the link, the linker modifies both
the path and the file name that the link points to, but not the
location inside the file (the DescriptionOfNSString
in the example).
Practical Usage of the linker
The typical use of the linker is maintaining cross-references in
software documentation. You need to establish a convention for
link names that all your software documentation follows. For
example, suppose that your documentation is about C libraries.
For each C function, you might decide to tag its documentation
with the name function$function_name. For example, the place in
the documentation that describes the start_library() function
would have the HTML tag <a name="function$start_library">. Having
established this convention, in any HTML file in your
documentation in which you want to create a link to the
documentation for the start_library() function, you use the code
<a rel="dynamic" href="#function$start_library">. Note that you
can ignore the problem of locating the actual file which contains
the documentation for the start_library() function; that is
precisely what the linker will do for you. Whenever you
install the documentation for a new project, you first create a
relocation file for the project documentation, by running
HTMLLinker -BuildRelocationFileForDir Documentation
if, for example, the project documentation is in the Documentation
subdirectory. This will create a Documentation/table.htmlink file,
which contains a list of all names found in the project
documentation and, for each of them, the file in which it's found.
Then, you install the project documentation (say, for example, that
it's installed into /opt/gnustep/Local/Documentation/MyProject), and
once it's installed, you can run the linker to update all links so
that they point to the actual files:
HTMLLinker /opt/gnustep/Local/Documentation/MyProject \
-l /opt/gnustep/Local/Documentation/MyProject \
-l /opt/gnustep/Local/Documentation/MyOtherProject
This will fix up all links in MyProject's HTML files by using the
relocation files of both MyProject and MyOtherProject, so all links
to anything which is documented inside those files will be
generated correctly.
Usage of the tool with autogsdoc
You can use the tool with documentation generated by autogsdoc to
perform the linking (or to relink it). Make sure to use the option
-LinksMarker gsdoc because autogsdoc marks the links to be fixed up
by the linker by using rel="gsdoc".
Specification
Modes of operation
The HTML linker works in two phases:
- The first (called generation of the relocation table)
preprocesses a given set of HTML files so that they can be the
destination of links. It builds a relocation table for the given
set of HTML files. This relocation table simply maps all names
(as in <a name="xxx">) in the files to the file in which the name
is found. The HTML files are not touched. The linker is able to
merge this dynamically generated relocation table with
pregenerated relocation tables loaded from files (called
relocation files).
- The second (called linking) links a given file to
the available HTML files on disk, by using the relocation table
to modify the HTML links in the file so that they point to
existing files.
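The first phase can be illustrated with a short Python sketch. This is not the linker's actual implementation; it only shows the idea of mapping every <a name="..."> to the file that contains it. The names NameCollector and build_relocation_table, and the choice to pass documents as an in-memory dictionary, are assumptions made for the example.

```python
from html.parser import HTMLParser


class NameCollector(HTMLParser):
    """Collects the value of every <a name="..."> attribute in a document."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            for key, value in attrs:
                if key == "name" and value:
                    self.names.append(value)


def build_relocation_table(documents):
    """Map each name to the path of the file in which it is found.

    `documents` maps a file path to its HTML text (a hypothetical input
    shape chosen for this sketch; the files themselves are not modified).
    """
    table = {}
    for path, html in documents.items():
        collector = NameCollector()
        collector.feed(html)
        for name in collector.names:
            # In this sketch a duplicate name silently overwrites the
            # earlier entry.
            table[name] = path
    return table


docs = {
    "/home/nicola/Doc/NSStringOverview.html":
        '<h1>NSString</h1><a name="DescriptionOfNSString"></a>',
}
table = build_relocation_table(docs)
print(table)
```

Merging a pregenerated relocation table into this one would then amount to combining two such dictionaries.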
The HTML linker can also be run in a special mode, to generate a
relocation file for later reuse. In this mode, the HTML linker
will build the relocation table for all files in a directory, then
save the relocation table into a table.htmlink file in that
directory for later reuse.
There are three kinds of files:
- input files: these are HTML files which are modified as a
consequence of linking; they have their links fixed up.
- destination files: these are HTML files which are read to
produce relocation tables.
- relocation files: these are not HTML files. They are only
created and read by the linker (unless you have a tool which can
manage them), and are in a specific, very simple format. They are
used to save relocation information for later reuse, so that the
linker can run faster. Normally, they have a .htmlink extension.
Linker behaviour
The linker keeps a main relocation table, which is empty at the
beginning. When run, the linker performs the following steps:
- the linker reads and parses all relocation files specified on
the command line, and merges the relocation tables found there
into the main relocation table.
- the linker reads and parses all destination files specified on
the command line, and builds a relocation table for them,
merging it into the main relocation table.
- if any input files are specified on the command line, the
linker links the files using the relocation table.
Specifying input, destination and relocation files
All command line arguments which do not begin with a hyphen (-),
and which are not the values of defaults (for example, not the YES
in -Warn YES, because that is the value of the default -Warn), are
interpreted as input files. Each destination file is specified by
using a -d option, and each relocation file by using a -l option.
If a directory is specified as an input (or destination) file, the
linker will recurse into the directory and add to the list of input
(or destination) files all files in the directory (and in the
directory's subdirectories, no matter how deeply nested) which have
one of the following extensions: .html, .HTML, .htm or .HTM. If a
directory is specified as a relocation file, the linker will add to
the list of relocation files all files in the directory which have
the extension .htmlink. A typical invocation of the linker is as
follows:
HTMLLinker -BuildRelocationFileForDir Doc
Builds a relocation file for the documentation in the directory
Doc. After this has been done, the directory Doc can be used as a
-l argument.
HTMLLinker test.html -l Doc
Links the file test.html using the relocation file just generated
in the Doc directory.
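The way a directory given as an input or destination file expands into a list of HTML files can be sketched in Python as follows. This is only an illustration of the rule described above, not the linker's code; the function name collect_html_files and the temporary directory used for the demonstration are assumptions of the example.

```python
import os
import tempfile

# The four extensions listed above; the match is case-sensitive, so a
# file called page.Html would not be picked up by this sketch.
HTML_EXTENSIONS = (".html", ".HTML", ".htm", ".HTM")


def collect_html_files(path):
    """Return `path` itself if it is not a directory, else all HTML files
    found in the directory and in its subdirectories at any depth."""
    if not os.path.isdir(path):
        return [path]
    found = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name.endswith(HTML_EXTENSIONS):
                found.append(os.path.join(root, name))
    return sorted(found)


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub", "deeper"))
    for relative in ("index.html", "notes.txt",
                     os.path.join("sub", "deeper", "api.HTM")):
        with open(os.path.join(top, relative), "w") as handle:
            handle.write("<html></html>")
    found = [os.path.relpath(p, top) for p in collect_html_files(top)]

print(found)
```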
What is a link
A link is an anchor tag with an href, such as
<a href="dest.html#location">. The destination file of the link is
the file specified in the href; dest.html in the example. The
destination file is ignored by the linker; the name of the link
(which is everything that follows the #) is used to perform the
linking.
Which links are fixed up
Normally, the linker will only fix up links which have the rel
attribute set to dynamic, as in the following example:
<a href="nicola.html" rel="dynamic">. In this way, you can specify
in your HTML document which links you want to be fixed up, and which
you don't. You can change the type of links to be fixed up by using
the -LinksMarker option, as in -LinksMarker gsdoc, which causes the
linker to fix up all links with the rel attribute set to gsdoc
rather than dynamic. In certain situations you might want to force
the linker to attempt to fix up all links; you can run the linker
with the -FixupAllLinks YES option to cause this behaviour. As a
special exception, links which obviously are not to be fixed up,
such as links beginning with mailto: or news:, or links without a
name, are never fixed up.
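The selection rules above can be summarised as a small Python predicate. It is a sketch under stated assumptions, not the linker's code: the function name should_fix_up and its parameters are invented for the example, and "a link without a name" is taken to mean an href with nothing after a #.

```python
def should_fix_up(href, rel, links_marker="dynamic", fixup_all_links=False):
    """Decide whether a link is a candidate for fix-up.

    `href` and `rel` are the attribute values of an <a> tag; `rel` may
    be None when the attribute is absent.
    """
    # Links which obviously are not document links are never fixed up.
    if href.startswith(("mailto:", "news:")):
        return False
    # A link without a name (nothing after '#') cannot be looked up.
    _file, separator, name = href.partition("#")
    if not separator or not name:
        return False
    # -FixupAllLinks YES: every remaining link is fixed up.
    if fixup_all_links:
        return True
    # Otherwise only links marked with the chosen rel value qualify.
    return rel == links_marker


print(should_fix_up("NSString.html#DescriptionOfNSString", "dynamic"))
```

Passing links_marker="gsdoc" corresponds to running the linker with -LinksMarker gsdoc.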
How links are fixed up
When the HTML linker encounters a link which needs to be fixed up
(say <a href="dest.html#location">), it searches the relocation
table for a destination file which contains the location name. If
no such file is found, the HTML linker emits a warning, and
replaces the link in the file with a link to the destination
without the filename. In the example, it would simply emit
<a href="#location">. If instead a destination file is found in the
relocation table, the HTML linker replaces the link with the full
path to the destination file on disk. For example, if, according to
the relocation table, the file /home/nicola/Doc/dest.html contains
the name location, the HTML linker will fix up the link to be
<a href="/home/nicola/Doc/dest.html#location"> (as a special
exception, if there is a path mapping which matches the path to the
destination file, it's applied to the path in the link. See below
for a detailed explanation of path mappings).
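A minimal Python sketch of this lookup is shown here. It assumes the relocation table is a dictionary from names to file paths, and the function name fix_up_href is hypothetical; path mappings are left out.

```python
import sys


def fix_up_href(href, relocation_table):
    """Return the rewritten href for a link which needs to be fixed up."""
    # The file named in the original href is ignored; only the name
    # after '#' is used for the lookup.
    _old_file, _separator, name = href.partition("#")
    destination = relocation_table.get(name)
    if destination is None:
        # Name not found: warn and keep only the location.
        print(f"warning: no destination file for name '{name}'",
              file=sys.stderr)
        return "#" + name
    # Name found: point at the full path of the destination file.
    return destination + "#" + name


table = {"location": "/home/nicola/Doc/dest.html"}
print(fix_up_href("dest.html#location", table))
print(fix_up_href("dest.html#unknown", table))
```

Because the original file name is ignored, a link written as <a href="#location"> resolves in exactly the same way in this sketch.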
It's important to note that you must have unique link names for
the linker to work properly. For example, if you have two different
destination files, say NSObject.html and NSString.html, both
containing the name init, then the linker can't resolve
<a href="#init">, because it has no way to know whether you meant
the link to point to the first or the second destination file. You
should choose names so that they uniquely specify what they
represent, for example NSObject_i_init and NSString_i_init if the
first name marks the place documenting the -init method of the
NSObject class and the second the place documenting the -init method
of the NSString class. Then every link will clearly refer to one
place or the other, and no confusion will arise. If there are
multiple destination files for a link, the linker will guess which
one is the right one, and that might not give the desired result.
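If you want to check your own documentation for clashing names before linking, something along the following lines could be used. This is not a feature of the linker; it is a hypothetical helper written for illustration, and it assumes you already have a list of (name, file) pairs.

```python
from collections import defaultdict


def find_ambiguous_names(definitions):
    """Return the names which are defined in more than one file.

    `definitions` is an iterable of (name, file) pairs. The result maps
    each ambiguous name to the sorted list of files which contain it.
    """
    files_by_name = defaultdict(set)
    for name, path in definitions:
        files_by_name[name].add(path)
    return {name: sorted(paths)
            for name, paths in files_by_name.items()
            if len(paths) > 1}


definitions = [
    ("init", "NSObject.html"),
    ("init", "NSString.html"),
    ("NSObject_i_init", "NSObject.html"),
    ("NSString_i_init", "NSString.html"),
]
ambiguous = find_ambiguous_names(definitions)
print(ambiguous)
```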
How links are checked
When a link is fixed up, the linker implicitly checks that the link
is correct, because if the link name can't be found in the relocation
tables, a warning is issued.
Path mappings
Path mappings are an additional feature of the HTML linker which
can be used when exporting documentation to be served by a web
server. If you are not putting your documentation on a web server
but simply reading it from the filesystem, then you don't need
path mappings. The issue with exporting documentation to a web
server is that the paths used to refer to files in URLs are not
necessarily the paths where the files are on disk. For example,
suppose that you have some HTML documentation in /opt/doc/base and
some other HTML documentation in /opt/doc/gui. The HTML files in
the two documentation directories refer to each other. You can run
the HTML linker to fix up all links, and everything works when the
files are read from disk. But now suppose that you set up a web
server; the web server, for example, will serve URLs beginning with
/Base (that is, requests from a browser of the form
http://www.server.org/Base) by taking files from /opt/doc/base, and
URLs beginning with /Gui by taking files from /opt/doc/gui.
To fix up the links in this case, you need path mappings. A path
mapping specifies that a certain directory on disk is to be
referred to in some different way in links. In the example, you
would pass
-PathMappings '{ "/opt/doc/base"="/Base"; "/opt/doc/gui"="/Gui"; }'
to the linker.
Each path mapping maps a path on disk to a virtual path. For
example, it maps the path on disk /opt/doc/base to the virtual path
/Base.
Each time the linker fixes up a link, after finding the
destination file, it checks the list of path mappings. If the
path to the destination file begins with the path on disk of one
of the path mappings, then that path on disk is replaced with the
corresponding virtual path in the path to the destination file
before the path to the destination file is written out in the link.
For example, if you have the path mapping explained above, and if
the linker is fixing up the link <a href="hi.html#nicola">, where
the destination file is /opt/doc/base/nicola/hi.html, then the
destination path matches the path mapping for /opt/doc/base, so the
path mapping is applied and the link is fixed up to be
<a href="/Base/nicola/hi.html#nicola"> rather than
<a href="/opt/doc/base/nicola/hi.html#nicola"> as it would normally
have been without the path mapping.
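The prefix replacement can be sketched in a few lines of Python. The function name apply_path_mappings is hypothetical, and the sketch assumes that the first mapping whose path on disk matches is the one applied; the text above does not say how the linker chooses when several mappings match.

```python
def apply_path_mappings(path, mappings):
    """Replace a matching path on disk at the start of `path` with its
    virtual path; return `path` unchanged if no mapping matches."""
    for disk_path, virtual_path in mappings.items():
        if path.startswith(disk_path):
            # Assumption: the first matching mapping wins.
            return virtual_path + path[len(disk_path):]
    return path


mappings = {"/opt/doc/base": "/Base", "/opt/doc/gui": "/Gui"}
destination = "/opt/doc/base/nicola/hi.html"
print(apply_path_mappings(destination, mappings) + "#nicola")
```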
Specifying path mappings
On the command line
Each path mapping specifies a mapping of a path on disk to a web
server alias. The first way to specify the mappings is on the
command line, in the form of a dictionary argument to the
-PathMappings option, as in
-PathMappings '{ "/opt/doc/base"="/Base"; "/opt/doc/gui"="/Gui"; }'
where /opt/doc/base and /opt/doc/gui are the paths on disk and
/Base and /Gui are the corresponding web server URL paths.
In a path mappings file
The other way to specify mappings is to write them into a file,
in the format of a dictionary, as, for example, in a file containing
the following lines
{
"/opt/doc/base"="/Base";
"/opt/doc/gui"="/Gui";
}
and then tell the linker to read the path mappings from that file,
by giving the filename as the argument of the -PathMappingsFile
option. For example, if the file containing the mappings is called
mappings, then you need to pass
-PathMappingsFile mappings
to the linker to have it read mappings from the file.
Command line path mappings override file path mappings
Both command line path mappings and path mappings from a file can
be used at the same time; in case of conflict, command line path
mappings override path mappings from the file.
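The precedence rule can be pictured as a dictionary merge. The following Python sketch assumes both sets of mappings have already been parsed into dictionaries; the function name merge_path_mappings and the /OldGui value are made up for the example.

```python
def merge_path_mappings(file_mappings, command_line_mappings):
    """Combine both sources; on conflict the command line value wins."""
    merged = dict(file_mappings)
    # update() overwrites any key which is already present, so command
    # line entries replace entries read from the file.
    merged.update(command_line_mappings)
    return merged


from_file = {"/opt/doc/base": "/Base", "/opt/doc/gui": "/OldGui"}
from_command_line = {"/opt/doc/gui": "/Gui"}
merged = merge_path_mappings(from_file, from_command_line)
print(merged)
```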
Summary of all the options
Each of the options beginning with a single hyphen (-) requires an
argument, as in
HTMLLinker Documentation -LinksMarker gsdoc -d Documentation
which sets LinksMarker to gsdoc. Options may appear anywhere on the
command line. Options which do not begin with a single hyphen (such
as --help) do not require an argument, as in
HTMLLinker --help
-d
Followed by a destination HTML file, or a directory containing
destination HTML files.
-l
Followed by a relocation file, or a directory containing relocation files.
-FixupAllLinks
If set to NO (the default), only links with the rel attribute set
to dynamic (or to whatever is specified as LinksMarker) are fixed
up in the input files. If set to YES, all links are fixed up.
-LinksMarker
If set (and if FixupAllLinks is NO), only links with the rel
attribute set to its value are processed. By default it is set to
dynamic.
-PathMappings
If set to a dictionary, read the dictionary as path mappings. See
above for more details of path mappings.
-PathMappingsFile
If set to a string, consider it to be the name of a file; read
path mappings from that file. The file must contain the path
mappings in the form of a dictionary. See above for more details
on path mappings.
-Verbose
If set to YES, prints more messages than if set to NO (the
default).
--help
Prints a quick explanation of the command line syntax and exits.
--version
Prints the version and exits.
Nicola Pero
Last modified: Sun Jan 6 22:54:58 GMT 2002