Documentation for the Combine (focused) crawling system
Anders Ardö
Contents
Overview
Introduction
Open Source distribution, Installation
Installation
Getting started
Detailed documentation
Use scenarios
Configuration
Configuration files
Crawler operation
URL selection criteria
Document parsing
URL filtering
Link selection/scheduling policy
Built in topic filter - automated subject classification
Topic filter Plug-In API
Analysis
URL recycling
Complete application - SearchEngine in a Box
Evaluation of automated subject classification
Performance
System components
combineINIT
combineCtrl
combineUtil
combineExport
Internal executables and Library modules
Bibliography
Gory details
Frequently asked questions
Configuration variables
Name/value configuration variables
Complex configuration variables
Module dependences
Programs
Library modules
External modules
APPENDIX
Simple installation test
Example topic filter plug in
Default configuration files
SQL database
Manual pages
About this document ...
root 2006-11-08