Indexing sources that require login

Indexing sources that require login works in Sitevision Crawler exactly the same way as in a standard version of Nutch, with one exception.

As of version 1.1, Sitevision Crawler can authenticate against systems that use form-based login where the login form lacks an id attribute.

Example of httpclient-auth.xml using this feature:

   <credentials authMethod="formAuth"
                loginFormActionUrl="Value of action attribute in the login form">
         <field name="username"
                value="Login user name" />
         <field name="password"
                value="Login password" />
         <field name="User-Agent"
                value="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1" />
   </credentials>
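
The value for loginFormActionUrl and the field names have to be read from the login page itself. As a rough aid, the following sketch lists each form on a page with its id, its action attribute and the names of its input fields. It uses only the Python standard library and is not part of Sitevision Crawler or Nutch; the class name FormScanner and the sample page (including the action URL /portal/j_security_check) are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class FormScanner(HTMLParser):
    """Collect the id, action URL and input names of every <form> on a page."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> encountered
        self._current = None  # form currently being parsed, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "id": attrs.get("id"),          # None when the form has no id
                "action": attrs.get("action"),  # candidate for loginFormActionUrl
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            name = attrs.get("name")
            if name:  # unnamed inputs (e.g. a bare submit button) are skipped
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Hypothetical login page; a real one would be fetched from the target system.
SAMPLE_PAGE = """
<html><body>
  <form action="/portal/j_security_check" method="post">
    <input type="text" name="username">
    <input type="password" name="password">
    <input type="submit" value="Log in">
  </form>
</body></html>
"""

scanner = FormScanner()
scanner.feed(SAMPLE_PAGE)
for form in scanner.forms:
    print(form["id"], form["action"], form["fields"])
```

A form whose id is reported as None is the case described above: it cannot be identified by id, so its action value is the one to put in loginFormActionUrl, and the listed input names are the candidates for the field entries.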