Jianhua,
  I think that we need to try and sort out what is/ and is not available
through KEGG. I have started to write the package, but I would like you to
do a lot of the basic implementation and documentation.
  Most of the types etc are available from the KEGG web page and we do need 
a workable version of this by the end of the first week in August as I would
like to have some of it in the book.

  As near as I can tell we are going to need some sort of listing of the 
three letter species abbreviations that they are using - so that one
can request pathway data for a specific species. And I am not sure how that 
impacts on the data in KEGG - because somehow I think what we have there
now is not right.

The environments: KEGGPATHID2EXTID and KEGGEXTID2PATHID cannot be right,
 and I am guessing that we need to leave those to some sort of organism 
specific data base or to the KEGGSOAP package.


For example: most of the get_xxx functions will return objects with these
slots. I don't think we need an S4 class but we should document what 
will happen.

RETURN TYPES:

genes_id1         genes_id of the query (string)
genes_id2         genes_id of the target (string)
sw_score          Smith-Waterman score between genes_id1 and genes_id2 (int)
bit_score         bit score between genes_id1 and genes_id2 (float)
identity          identity between genes_id1 and genes_id2 (float)
overlap           overlap length between genes_id1 and genes_id2 (int)
start_position1   start position of the alignment in genes_id1 (int)
end_position1     end position of the alignment in genes_id1 (int)
start_position2   start position of the alignment in genes_id2 (int)
end_position2     end position of the alignment in genes_id2 (int)
best_flag_1to2    best-best flag from genes_id1 to genes_id2 (boolean)
best_flag_2to1    best-best flag from genes_id2 to genes_id1 (boolean)
definition1       definition string of the genes_id1 (string)
definition2       definition string of the genes_id2 (string)
length1           amino acid length of the genes_id1 (int)
length2           amino acid length of the genes_id2 (int)

Also, I found a lot of examples at:
http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioruby/lib/bio/io/keggapi.rb?cvsroot=bioruby&rev=1.5
(just search on something like get_motifs_by_gene at Google and this comes
up. It seems to have some sensible values.

Here are some examples that I have got to go, together with my simple 
package example. Which is on rna in
~rgentlem/Rpacks/KEGGSOAP
and the SSOAP that I have done some surgery on is there as well.


g1 = get_best_neighbors_by_gene("eco:b0002",1, 5)

sapply(g1, function(x) x$genes_id2)

g2 = get_genes_by_pathway("path:hsa00020")

g3 = get_genes_by_pathway("path:eco00020")

g4 = get_genes_by_pathway("path:ddi00020")

dbs = list_databases()

xx=get_motifs_by_gene("eco:b0002", "pfam")
