Package: gskb
Type: Package
Title: Gene Set data for pathway analysis in mouse
Version: 1.10.0
Date: 2015-03-18
Author: Valerie Bares, Xijin Ge
Maintainer: Valerie Bares <valerie.bares@sdstate.edu>
Description: Gene Set Knowledgebase (GSKB) is a comprehensive knowledgebase
        for pathway analysis in mouse. Interpretation of high-throughput 
        genomics data based on biological pathways constitutes a constant 
        challenge, partly because of the lack of supporting pathway databases.
        We created a functional genomics knowledgebase in mouse, which includes
        33,261 pathways and gene sets compiled from 40 sources such as Gene
        Ontology, KEGG, GeneSetDB, PANTHER, and microRNA and transcription
        factor target genes. In addition, we manually collected and curated
        8,747 lists of differentially expressed genes from 2,526 published gene
        expression studies to enable the detection of similarity to previously
        reported gene expression signatures. These two types of data constitute
        a comprehensive Gene Set Knowledgebase (GSKB), which can be readily 
        used by various pathway analysis software such as gene set enrichment
        analysis (GSEA).
        As a first step, we gathered annotation information from 40 existing 
        databases for mouse-related gene sets. These gene sets are divided into
        7 categories, namely, Gene Ontology, Curated pathways, Metabolic
        Pathways, Transcription Factor (TF) and microRNA target genes, location
        (cytogenetics band), and others. We used information in GeneSetDB for 
        some of the databases. Detailed information on these 40 sources and the
        citations is available at
        http://ge-lab.org/gskb/Table%201-sources.pdf .
        The gene lists from literature were retrieved manually from individual
        gene expression studies through a process similar to the one used to 
        create AraPath, a comparable resource for Arabidopsis [12]. As most
        expression studies upload raw data to repositories like GEO and 
        ArrayExpress, we used the meta-data in these databases to search for 
        publications. We scanned all datasets we could find and retrieved 4,313
        potentially useful papers reporting gene expression studies in mouse. 
        These papers were individually read by curators to identify lists of 
        differentially expressed genes in various conditions. We compiled a 
        total of 8,747 lists of differentially expressed genes from 2,518 of
        these papers. Each gene list was annotated with a unique name, brief
        description, and publication information, similar to the protocol used
        in MSigDB and AraPath. These gene lists constitute a large collection
        of published gene expression signatures that form a foundation for
        interpreting new gene lists and expression profiles.
        More information about these data is available at
        http://ge-lab.org/gskb/. A paper describing these data is
        currently in revision at Database: The Journal of Biological
        Databases and Curation.
License: Artistic-2.0
biocViews: ExperimentData, Mus_musculus
Depends: R (>= 3.2.0)
Suggests: PGSEA
NeedsCompilation: no
Packaged: 2017-10-31 16:30:46 UTC; biocbuild
