webseer: HTML analysis library

User's Guide


Table of Contents

Who is this for?
I. about:webseer
1. Introduction
2. Fundamental concepts
webseer Architectural Constructs
Fetcher
Generator
Transformer
FeatureExtractor
Identity
II. webseer as a tool
3. Quick Example
4. Installation and Setup
5. Usage Tutorial
III. webseer as a library
6. Quick Example
7. Installation and Setup
8. API Tutorial
IV. Extending webseer
9. Adding a Feature
V. Reference
10. Included Models
http://ryan.levering.name/webseer/model/html/1.0
http://ryan.levering.name/webseer/model/rendered-html/1.0
http://ryan.levering.name/webseer/model/text/1.0
http://ryan.levering.name/webseer/model/fragmented-text/1.0
http://ryan.levering.name/webseer/model/parsed-text/1.0
11. Included Algorithms
Fetchers
http://ryan.levering.name/webseer/fetcher/nutch-http/1.0
Generators
http://ryan.levering.name/webseer/generator/html-tidy/1.0
http://ryan.levering.name/webseer/generator/html-nekohtml/1.0
http://ryan.levering.name/webseer/generator/renderedhtml-xpcom/1.0
http://ryan.levering.name/webseer/generator/text/1.0
Transformers
http://ryan.levering.name/webseer/transformer/html-text/1.0
http://ryan.levering.name/webseer/transformer/renderedhtml-text/1.0
http://ryan.levering.name/webseer/transformer/text-fragmentedtext/1.0
FeatureExtractors
http://ryan.levering.name/webseer/extractor/text-basic/1.0
12. Command Line Tools
extract
Syntax
Description