NAME

swconv - ``Shared-Web'' converter.


SYNOPSIS

swconv [options] directory


DESCRIPTION

swconv was designed for use at Marathon Oil Company (http://www.marathon.com/). (The name ``swconv'' is historical only.)

swconv is a utility to make moving large directory trees of web content from one virtual server (or directory) to another easier.

Every once in a while, the need arises to move web content from on virtual server to another. When that happens, the publisher or webmaster is left with the daunting task of changing/fixing/updating lots of hyperlinks that broke as a result of the move.

Unfortunately, the process of fixing those links is quite tedious and error prone when attempted manually. One must be sure to check (and possibly update references on the old an new virtual server. One must check all places where there might have been links to the moved content, the moved content itself may have broken links to content that used to be ``relative'' (because it was on the old virtual server), and so on.

swconv makes this process easier. It is designed to be run after the web content has been moved from the old virtual server to the new one and before any manual attempts to update the links have been attempted.

swconv scans for links embedded in these tags:

   <IMG...>
   <A...>
   <LINK...>
   <FORM...>
   <INPUT...>

You will need to give swconv a fair amount of information about your setup. This can be accompished either with command-line options (see ``OPTIONS'' below) or via a configuration file specified with the ``--config'' option (not yet implemented).

One may simply invoke swconv with some command-line options (see ``OPTIONS'' below) to customize its behavior and the path to the directory which needs to be processed.

After reading through the available options, you should read the ``EXAMPLES'' section for ideas of how to use swconv when.


OPTIONS

wclogrot can take a combination of many command-line options. All options begin with a double-dash (--). Some take mandatory parameters, while others are simple flags to enable a certain option.

The available options are:

--logfile ``/path/to/file''

NOT IMPLEMENTED YET

The location and file in which you'd like swconv to put messages. If a log file is specified, nothing will be printed to STDOUT unless a problem creeps up. If you specify the ``--verbose'' flag, it will increase the verbosity of the messgaes put into this file.

--debug

Enable the output of debugging messages to STDOUT. This is really only useful for development or if you are trying to track down a problem. The debugging messages should give you a good idea of what is happening as swconv goes through its motions. Debugging is disabled by default. Debugging is not automatically redirected to a log file when the ``--logfile'' option is used. You must redirect debugging messages from the shell if you wish to capture them.

--backup

Enable the creation of backup files. Every file that is modified will be backed up by appending a backup suffix to the full name of hte file (see the ``--backup-suffix'' option, below). Backups are disabled by default.

--backup-suffix "blah"

NOT IMPLEMENTED YET

Use this as the backup suffix. (See the ``--backup'' option, above) for details about what this does. The default value of backup-suffix is ``.bak''

--test

Tell swconv to run in test mode. It will go through all the motions, respecting other swtiches you've set, but when it comes to commit the changes to disk, it will not rewrite any files. This is useful for pre-scanning directories to see what the impact of running swconv will be.

Please use this flag when you're testing things. Depending on the ``--backup'' option isn't always what you want to do.

--old-url-prefix ``http://www.some.thing/foo/''

Required

Tell swconv the URL prefix of the old content location. This defaults to null, so is required that you set this option.

--new-url-prefix ``http://www.new.thing/bar/''

Required

Tell swconv the URL prefix of the new content location. This defaults to null, so it is required that you set this option.

--dir-is-source

Tell swconv that the directory you've specified is the source (meaning that it may contain links to the moved content). Source is where links start. This cannot be set if ``dir-is-destination''is set or an error will result.

--dir-is-destination

Tell swconv that the directory you've specified is the destination (meaning that it contains the moved content). Destination is where the links will go. This cannot be set if ``dir-is-source'' is set or an error will result.

--help

Display a brief help message to remind you of command-line arguments. This won't happen by default.

directory

The directory name/path that you specify may contain either ``/'' or ``\'' as path separators (Unix or DOS style). They will be converted properly internally.


EXAMPLES

Following are several examples of using swconv in different situations. In cases when the command-line arguments are quite long, lines are broken up with trailig backslahes (``\'').


Example #1

You have moved content from /www/companyX/products to /www/companyZ/products, which are hosted as http://www.companyX.com/products and http://www.companyZ.com/products respectively.

You move the content and remove the /www/companyX/products directory.

There may be links in /www/companyX/about_us which refer to pages in what is now /www/companyZ/products, but the links don't know that.

To update any links in /www/companyX/about_us, do the following:

  cd /www/companyX
  swconv --dir-is-source \
         --old-url-prefix "http://www.companyX.com/products/"; \
         --new-url-prefix "http://www.companyZ.com/products/"; \
         --backup about_us

There may also be links in /www/comapnyZ/products which refer to pages in what is now on a different virtual server, the stuff still sitting in /www/companyX/.

To adjust those links, do the following:

  cd /www/companyZ
  swconv --dir-is-destination \
         --old-url-prefix "http://www.companyX.com/products/"; \
         --new-url-prefix "http://www.companyZ.com/products/"; \
         --backup products

Both of those sets of commands will produce backup files in those directories. Only files which change are backed up.


Example #2

This needs to be a more clever example of using swconv usage.

It's a bit too late in the evening for me to come up with one, so let me know if you think of anything.


BUGS

Note: swconv makes fairly extensive attempts to trap most trappable errors. In the event of a problem, it will print a failure notice and exit with a non-zero exit code.

I should add a ``--sleep-between-files'' option that can call sleep() between processing files. This will make it less of a CPU hog at the expense of the process taking a bit longer.

swconv can only process one directory tree at a time. You may need to write a small script to have it hit lots of peer directory trees (util I add the necessary code to handle multiple directory names on the command-line).

Also, this may or may not be a bug, depending on your point of view. swconv does not parse the HTML file(s). It simply applies some clever regexes to the file based on the command-line arguments you supply. It does not know about embedded code such as JavaScript. Commented chunks of HTML may get adjusted when you don't want them to.


AUTHOR AND COPYRIGHT

Copyright 1998, Jeremy D. Zawodny

swconv may be used, copied, and re-distributed under the same terms and conditions as Perl.


VERSION

Documentation: $Revision: 1.11 $