Extracts features from URLs to data files.
Options:
Specify a file to write the features to. Defaults to writing to the console.
Filter the list of features gathered from the input URLs. Defaults to using all features.
Change the feature output format. Defaults to line-separated features when writing to the console and comma-separated features when writing to a file.
Use the input URLs as seeds for a breadth-first crawl to collect statistics. depth is the number of levels the crawl descends and defaults to 1. filter is a regular expression that constrains the crawled URLs beyond the default crawler settings.
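
The crawl option can be pictured with a minimal sketch. This is not the tool's actual implementation: the function name crawl, the get_links callback, and the in-memory LINKS table are hypothetical stand-ins, and the sketch assumes that a depth of 1 means following links one level out from the input URLs and that the filter is applied to discovered links only, not to the input URLs themselves.

```python
import re
from collections import deque

def crawl(seeds, get_links, depth=1, url_filter=None):
    """Breadth-first crawl. Returns {url: level} for every URL reached.

    get_links is a hypothetical callback that returns the outgoing links
    of a URL; a real crawler would fetch and parse the page instead.
    """
    pattern = re.compile(url_filter) if url_filter else None
    seen = {url: 0 for url in seeds}   # input URLs sit at level 0
    queue = deque(seen.items())
    while queue:
        url, level = queue.popleft()
        if level >= depth:
            continue                   # do not expand past the requested depth
        for link in get_links(url):
            if link in seen:
                continue               # already queued or visited
            if pattern and not pattern.search(link):
                continue               # rejected by the regex filter
            seen[link] = level + 1
            queue.append((link, level + 1))
    return seen

# Hypothetical link graph used in place of real HTTP requests.
LINKS = {
    "http://a.example/": ["http://a.example/x", "http://b.example/"],
    "http://a.example/x": ["http://a.example/y"],
}

def get_links(url):
    return LINKS.get(url, [])

default_crawl = crawl(["http://a.example/"], get_links)
filtered_crawl = crawl(["http://a.example/"], get_links,
                       depth=2, url_filter=r"^http://a\.example/")
```

With the default depth only the direct links of the input URL are reached; with depth 2 and the filter, the crawl goes one level further but stays on a.example.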