NAME
    InSilicoSpectro - Open source Perl library for proteomics

INSILICOSPECTRO PROJECT DESCRIPTION
    This is the description of the entire InSilicoSpectro project; a
    description of the InSilicoSpectro.pm module is provided hereunder.

    InSilicoSpectro is a proteomics open-source project intended to cover
    common operations in mass list file format conversions, protein sequence
    digestion, theoretical mass spectra computations, theoretical and
    experimental MS data matching, text/graphic display, peptide retention
    time predictions, etc.

    The problems of raw data processing, storage and database searching are
    not addressed by the InSilicoSpectro project. InSilicoSpectro is
    released under the LGPL license and it is available from a dedicated web
    site at http://insilicospectro.vital-it.ch.

    The general design of the modules follows the object oriented
    programming (OOP) model and most of the modules are class definitions
    actually. The module that implements most of the theoretical mass
    computation routines supports a dual OOP and procedural programming
    model. InSilicoSpectro modules make use of some Perl modules that are
    not part of the standard Perl distribution, such as
    Statistics:Regression, XML:Twig, GD, and IA:NNFlex.

    We have developed a simple and minimal hierarchy to represent protein
    sequences and peptides (as digestion product) in a way that, on the one
    hand, fits the needs of the computations we perform and, on the other
    hand, stays relatively neutral in its design. Thus it should be possible
    to combine the latter classes with existing projects at users sites,
    e.g. via multiple inheritance, or to use them as the basis of more
    sophisticated objects.

    InSilicoSpectro Perl code is documented mainly via pod and a wide
    collection of simple and focused examples. An introductory explanation
    is provided here to guide new users and give them an understanding of
    the library that should be sufficient such that pod and the examples are
    the only necessary documentation.

  Installation
  Library organization
    InSilicoSpectro modules (lib/InSilicoSpectro) are organized according to
    their function. At the more general level there is a module named
    InSilicoSpectro.pm (This one!!) that provides general functionalities
    for initializing all other modules. More specialized modules are grouped
    in three folders:

    Spectra, for mass list-related;
    InSilico, for computational modules;
    Utils, for a few utility modules.

    In addition, illustrative examples can be found in three folders:

    scripts, which contains a set of tools implemented with InSilicoSpectro
    modules;
    cgi, which contains scripts implementing a simple web-based set of
    tools;
    t, which contains test programs that are examples as well.

    Now, by considering the main topics we cover in InSilicoSpectro one
    after another, we introduce the main modules and examples the user
    should try and look at to gain autonomy with the whole library.

  Mass list file format conversion
    A general purpose conversion program, convertSpectra.pl in folder
    scripts, allows you to convert one mass list format to another. A CGIzed
    version exists in the cgi folder: cgiConvertSpectra.pl.

    convertSpectra.pl is a good starting point to see a high-level usage of
    the basic methods implemented in the underlying modules.

    InSilicoSpectro::Spectra::ExpSpectrum is the basic class for
    representing spectra, i.e. a list of peaks (namely a list of pointers to
    peaks). Peaks are represented as list of attributes such as mass,
    intensity, SN, etc. The order of the attributes in these lists is given
    by an object of class InSilicoSpectro::Spectra::PeakDescriptor. See
    t/Spectra/testExpSpectrum.pl and t/Spectra/testPeakDescriptor.pl.

    By means of classes InSilicoSpectro::Spectra::MSSpectra,
    InSilicoSpectro::Spectra::MSMSSpectra,
    InSilicoSpectro::Spectra::MSMSCmpd, and InSilicoSpectro::Spectra::MSRun
    we represent PMF (MS) and MS/MS spectra, and HPLC runs. See
    t/Spectra/testSpectra.pl.

  Utils
    The module InSilicoSpectro::Utils::IO.pm contains miscellaneous
    utilities for accessing compressed files, defining a common verbose
    variable, etc.

  pI estimations
    scripts/computePI.pl is a tool that exemplify the usage of the class
    InSilicoSpectro::InSilico::IsoelPoint. Examples of how to use it can be
    found in t/InSilico/examples_rt_pi. See also the example in
    t/InSilico/testIsoelPoint.pl. A CGI version of computePI.pl can be found
    in cgi folder.

  Retention time prediction
    scripts/computeRT.pl is a tool that exemplify the usage of the class
    InSilicoSpectro::InSilico::RetentionTimer. Examples of how to use it can
    be found in t/InSilico/examples_rt_pi. See also the examples in
    t/InSilico/testPetritis.pl and t/InSilico/testHodges.pl. A CGI version
    of computeRT.pl can be found in cgi folder.

  Enzymes
    Enzymes are modeled by class InSilicoSpectro::InSilico::CleavEnzyme. See
    t/InSilico/testCleavEnzyme.pl.

  PTMs and other modifications
    Modifications of residues are modeled by class
    InSilicoSpectro::InSilico::ModRes. See t/InSilico/testModRes.pl.

  Protein and peptide sequences
    The basic class for biological sequences is
    InSilicoSpectro::InSilico::Sequence. We then define
    InSilicoSpectro::InSilico::AASequence to represent protein sequences
    with their modifications. A class InSilicoSpectro::InSilico::Peptide is
    used for enzymatic digestion products as we need special data in this
    case that are not part of a standard protein model.

    Examples can be found in t/InSilico: testSequence.pl, testAASequence.pl,
    testPeptide.pl.

  Protein digestion and mass computations
    The main module for digestion and mass computations is
    InSilicoSpectro::InSilico::MassCalculator. Examples of digestions and
    protein/peptide mass computations, including in the presence of
    fixed/variable modifications, are found in t/InSilico:
    testCalcDigest.pl, testCalcDigestOOP.pl, and testCalcVarpept.pl. OOP
    means an example with the OOP model as MassCalculator supports both an
    OOP and procedural interface.

  PMF
    The match between theoretical peptide masses and PMF experimental data
    is made by functions found in InSilicoSpectro::InSilico::MassCalculator.
    In the OOP model it is possible to represent PMF matches in objects of
    class InSilicoSpectro::InSilico::PMFMatch. See
    t/InSilico/testCalcPMFMatch.pl and t/InSilico/testCalcPMFMatchOOP.pl.

  Peptide fragmentation
    Theoretical fragment masses are computed by functions found in
    InSilicoSpectro::InSilico::MassCalculator. In the OOP model, theoretical
    MS/MS spectra can be represented as an object of class
    InSilicoSpectro::InSilico::MSMSTheoSpectrum, which represents in turn
    the various ions as InSilicoSpectro::InSilico::InternIonSeries and
    InSilicoSpectro::InSilico::TermIonSeries.

    The match between experimental and theoretical masses is also computed
    by InSilicoSpectro::InSilico::MassCalculator and in the OOP model the
    class InSilicoSpectro::InSilico::MSMSTheoSpectrum can store the match in
    addition to the theoretical spectrum.

    See in t/InSilico: testCalcFrag.pl, testCalcFragOOP.pl,
    testCalcMatch.pl, testCalcMatchOOP.pl, getIonIntensities.pl, ionStat.R.

  Graphical display of MS/MS spectra/matches
    The class InSilicoSpectro::InSilico::MSMSOutput instanciates objects
    aimed at providing different formats in order to represent MS/MS spectra
    and matches. See in t/InSilico: testMSMSOutText.pl, testMSMSOutLatex.pl,
    testMSMSOutHtml.pl, testMSMSOutPlot.pl, testMSMSOutLegend.pl.

  Mini web site
    In folder miniweb we provide a perl script build-miniweb.pl that builds,
    from CGI scripts in folder cgi, a simple web site for protein digestion,
    mass computations, and pI and retention time estimations.

MODULE DESCRIPTION
    The module InSilicoSpectro.pm comprises generic functions that are
    useful for the whole project.

FUNCTIONS
   saveInSilicoDef([$out])
    Saves all registered definitions into the configuration file named $out,
    e.g. insilicodef.xml

   getInSilicoDefFiles()
    Returns the list of configuration files given by the operating system
    environment variable, whose name is stored in
    $InSilicoSpectro::DEF_FILENAME_ENV (default "INSILICOSPECTRO_DEFFILE").

    The environment variable can point more than one file (separated by
    ':'), or be a glob ('...*...' expression).

   init([@files])
    Loads a list of configuration files given as parameter or the default
    configuration files as returned by getInSilicoDefFiles.

SEE ALSO
    InSilicoSpectro::InSilico, InSilicoSpectro::Spectra,
    InSilicoSpectro::Utils

COPYRIGHT
    Copyright (C) 2004-2005 Geneva Bioinformatics (www.genebio.com) &
    Jacques Colinge (Upper Austria University of Applied Science at
    Hagenberg)

    This library is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 2.1 of the License, or (at
    your option) any later version.

    This library is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser
    General Public License for more details.

    You should have received a copy of the GNU Lesser General Public License
    along with this library; if not, write to the Free Software Foundation,
    Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

AUTHORS
    Jacques Colinge, www.fhs-hagenberg.ac.at

    Alexandre Masselot, www.genebio.com