getBestNeighbors {KEGGSOAP} | R Documentation |
Given a KEGG gene id, these functions query the KEGG Sequence Similarity Database (SSDB) for genes in other organisms that are homologous to the target gene. Genes are termed homologous when the similarity of their aligned sequences reaches a chosen threshold.
get.best.best.neighbors.by.gene(genes.id, start, max.results)
get.best.neighbors.by.gene(genes.id, start, max.results)
getBestNeighbors(genes.id, start, max.results, what = c("best", "best_best"))
genes.id |
a character string for the id used by KEGG to represent the gene of interest. The id normally consists of three letters followed by a colon and then several numbers. The three letters are the first letter of the genus name and the first two letters of the species name in the scientific name of the organism concerned (e.g. hsa:111 for Homo sapiens) |
start |
an integer for the position in the query results from which entries will be extracted and returned |
max.results |
an integer for the maximum number of entries that will be extracted from the query results and returned |
what |
a character string, either "best" or "best_best", indicating whether best or reciprocal best ("best best") homologous genes are sought |
A given gene may have several homologous genes across organisms. A query to SSDB returns a list of genes that are homologous to the target gene. start and max.results indicate where on that list to start and stop extracting data for the returned results.
getBestNeighbors is a general function that queries the SSDB database and returns results according to whether the query is for best or best best homologous relationships.
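The way start and max.results select a window from the list of hits can be sketched without contacting KEGG. The helper page_results and the vector all_hits below are hypothetical and exist only for illustration; the sketch also assumes that start is 1-based, as the example call with start = 1 suggests.

```r
# Hypothetical stand-in for the full list of SSDB hits; no KEGG query is made.
all_hits <- paste0("org", 1:12, ":gene", 1:12)

# Hypothetical helper mimicking how start and max.results pick a window.
# Assumption: start is 1-based.
page_results <- function(hits, start, max.results) {
  if (start > length(hits)) return(hits[0])   # nothing left to return
  last <- min(start + max.results - 1, length(hits))
  hits[start:last]
}

first_page  <- page_results(all_hits, 1, 5)    # entries 1 to 5
second_page <- page_results(all_hits, 6, 5)    # entries 6 to 10
last_page   <- page_results(all_hits, 11, 5)   # only two entries remain
```

Under the same assumption, successive calls with start increased by max.results would walk through the full list of homologous genes.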
The functions return a list of lists. Each sub-list contains data for a gene that is homologous to the target gene with the following elements:
genes_id1 |
a character string for the id of the target gene used to query for homologous genes |
genes_id2 |
a character string for the id of the homologous gene found in another organism |
sw_score |
an integer for Smith-Waterman score between genes_id1 and genes_id2 |
bit_score |
a numeric value for the bit score between genes_id1 and genes_id2 |
identity |
a numeric value between 0 and 1 for the degree of identity between genes_id1 and genes_id2 |
overlap |
an integer for the overlapping length between genes_id1 and genes_id2 |
start_position1 |
an integer for the start position of the alignment in genes_id1 |
end_position1 |
an integer for the end position of the alignment in genes_id1 |
start_position2 |
an integer for the start position of the alignment in genes_id2 |
end_position2 |
an integer for the end position of the alignment in genes_id2 |
best_flag_1to2 |
a boolean that is TRUE if genes_id2 is the best neighbor gene of genes_id1 |
best_flag_2to1 |
a boolean that is TRUE if genes_id1 is also the best neighbor gene of genes_id2 |
definition1 |
a character string for the definition of genes_id1 |
definition2 |
a character string for the definition of genes_id2 |
length1 |
an integer for the amino acid length of genes_id1 |
length2 |
an integer for the amino acid length of genes_id2 |
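Because the return value is a list of lists, it is often easier to inspect as a data frame. The following is a minimal sketch using a mock object: hits, its gene ids (aaa:0001, bbb:0002) and all scores are invented placeholders, and only a subset of the elements listed above is included.

```r
# Hypothetical mock of the return value; ids and scores are invented.
hits <- list(
  list(genes_id1 = "eco:b0002", genes_id2 = "aaa:0001", sw_score = 5000L,
       bit_score = 1500.5, identity = 0.98, overlap = 820L,
       best_flag_1to2 = TRUE, best_flag_2to1 = TRUE),
  list(genes_id1 = "eco:b0002", genes_id2 = "bbb:0002", sw_score = 3200L,
       bit_score = 960.2, identity = 0.61, overlap = 805L,
       best_flag_1to2 = TRUE, best_flag_2to1 = FALSE)
)

# Turn each sub-list into a one-row data frame, then stack the rows.
hits_df <- do.call(rbind, lapply(hits, function(h) {
  as.data.frame(h, stringsAsFactors = FALSE)
}))

# Keep only reciprocal best neighbors (both flags TRUE).
reciprocal <- hits_df[hits_df$best_flag_1to2 & hits_df$best_flag_2to1, ]
```

The same conversion should apply to a real result as long as every sub-list carries the same elements with one value each.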
Jianhua Zhang
http://www.genome.jp/kegg/soap/doc/keggapi_manual.html
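if (require("SSOAP") && require("XML")) {
    # First five best neighbors of the E. coli gene eco:b0002
    bestGenes <- get.best.neighbors.by.gene("eco:b0002", 1, 5)
    # First five reciprocal best ("best best") neighbors of the same gene
    bestBestGenes <- get.best.best.neighbors.by.gene("eco:b0002", 1, 5)
}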