Transformers

http://ryan.levering.name/webseer/transformer/html-text/1.0

This is a text ripper that extracts plain-text documents from HTML.
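
As a rough illustration of what such a text ripper does, the following sketch uses only Python's standard library. It is not webseer's implementation; the names TextRipper and html_to_text are hypothetical, and skipping script and style content is an assumption about what "text" should include.

```python
from html.parser import HTMLParser


class TextRipper(HTMLParser):
    """Hypothetical helper: collects character data outside script/style."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse every run of whitespace to one space.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    """Return the text content of an HTML string (assumed behavior)."""
    parser = TextRipper()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Joining chunks with a space keeps adjacent block elements from running together, at the cost of splitting a word that is interrupted by inline markup.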

Parameters

none

http://ryan.levering.name/webseer/transformer/renderedhtml-text/1.0

This is a text ripper that attempts to use extra rendered information to assist in determining text visibility.
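
One way rendered information could help is sketched below. It is only an illustration under stated assumptions: the RenderedNode record, its fields, and the visibility rules are hypothetical and do not describe the data webseer actually passes to this transformer.

```python
from dataclasses import dataclass


@dataclass
class RenderedNode:
    """Hypothetical text node annotated with rendering results."""
    text: str
    display: str = "block"       # assumed computed CSS display value
    visibility: str = "visible"  # assumed computed CSS visibility value
    width: float = 1.0           # assumed rendered box width in pixels
    height: float = 1.0          # assumed rendered box height in pixels


def is_visible(node: RenderedNode) -> bool:
    """Assumed rule: hidden styles or an empty box mean the text is not shown."""
    if node.display == "none":
        return False
    if node.visibility in ("hidden", "collapse"):
        return False
    return node.width > 0 and node.height > 0


def visible_text(nodes) -> str:
    """Concatenate the text of the nodes judged visible."""
    return " ".join(
        n.text.strip() for n in nodes if is_visible(n) and n.text.strip()
    )
```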

Parameters

none

http://ryan.levering.name/webseer/transformer/text-fragmentedtext/1.0

This transformer uses several heuristics to pull text fragments out of plain text. A text fragment is a generalization of a sentence.
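
The heuristics themselves are not listed here. As a minimal sketch of the idea, the following assumes two simple rules: a fragment ends at sentence punctuation followed by whitespace, or at a blank line. The function name fragment_text and both rules are assumptions, not the transformer's actual logic.

```python
import re

# Assumed boundaries: whitespace after ., ! or ?, or a blank line.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+|\n\s*\n")


def fragment_text(text: str) -> list:
    """Split plain text into fragments and normalize inner whitespace."""
    parts = _BOUNDARY.split(text)
    return [" ".join(p.split()) for p in parts if p.strip()]
```

With these rules a heading such as "Home" followed by a blank line becomes its own fragment even though it is not a sentence, which is the sense in which fragments generalize sentences.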

Parameters

none