- High risk flaws (potentially leading to system compromise):
  - Server-side SQL injection (including blind vectors, numerical parameters).
  - Explicit SQL-like syntax in GET or POST parameters.
  - Server-side shell command injection (including blind vectors).
  - Server-side XML / XPath injection (including blind vectors).
  - Format string vulnerabilities.
  - Integer overflow vulnerabilities.
  - Locations accepting HTTP PUT.
- Medium risk flaws (potentially leading to data compromise):
  - Stored and reflected XSS vectors in document body (minimal JS XSS support present).
  - Stored and reflected XSS vectors via HTTP redirects.
  - Stored and reflected XSS vectors via HTTP header splitting.
  - Directory traversal (including constrained vectors).
  - Assorted file POIs (server-side sources, configs, etc.).
  - Attacker-supplied script and CSS inclusion vectors (stored and reflected).
  - External untrusted script and CSS inclusion vectors.
  - Mixed content problems on script and CSS resources (optional).
  - Incorrect or missing MIME types on renderables.
  - Generic MIME types on renderables.
  - Incorrect or missing charsets on renderables.
  - Conflicting MIME / charset info on renderables.
  - Bad caching directives on cookie-setting responses.
- Low risk issues (limited impact or low specificity):
  - Directory listing bypass vectors.
  - Redirection to attacker-supplied URLs (stored and reflected).
  - Attacker-supplied embedded content (stored and reflected).
  - External untrusted embedded content.
  - Mixed content on non-scriptable subresources (optional).
  - HTTP credentials in URLs.
  - Expired or not-yet-valid SSL certificates.
  - HTML forms with no XSRF protection.
  - Self-signed SSL certificates.
  - SSL certificate host name mismatches.
  - Bad caching directives on less sensitive content.
- Internal warnings:
  - Failed resource fetch attempts.
  - Exceeded crawl limits.
  - Failed 404 behavior checks.
  - IPS filtering detected.
  - Unexpected response variations.
  - Seemingly misclassified crawl nodes.
- Non-specific informational entries:
  - General SSL certificate information.
  - Significantly changing HTTP cookies.
  - Changing Server, Via, or X-... headers.
  - New 404 signatures.
  - Resources that cannot be accessed.
  - Resources requiring HTTP authentication.
  - Broken links.
  - Server errors.
  - All external links not classified otherwise (optional).
  - All external e-mails (optional).
  - All external URL redirectors (optional).
  - Links to unknown protocols.
  - Form fields that could not be autocompleted.
  - Password entry forms (for external brute-force).
  - File upload forms.
  - Other HTML forms (not classified otherwise).
  - Numerical file names (for external brute-force).
  - User-supplied links otherwise rendered on a page.
  - Incorrect or missing MIME type on less significant content.
  - Generic MIME type on less significant content.
  - Incorrect or missing charset on less significant content.
  - Conflicting MIME / charset information on less significant content.
  - OGNL-like parameter passing conventions.
Along with the list of identified issues, skipfish also provides summary overviews of the document types and issue types found, as well as an interactive sitemap, with nodes discovered through brute-force denoted in a distinctive way.
How to run the scanner?
Once you have the dictionary selected, you can try:
$ ./skipfish -o output_dir http://www.example.com/some/starting/path.txt
Note that you can provide more than one starting URL if so desired; all of them will be crawled.
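For instance, a single run covering two entry points on the same site might look like this (the paths are purely illustrative):
$ ./skipfish -o output_dir http://www.example.com/first/path/ http://www.example.com/second/path/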
Some sites may require authentication; for simple HTTP credentials, you can try:
$ ./skipfish -A user:pass ...other parameters...
Alternatively, if the site relies on HTTP cookies instead, log in using your browser or a simple curl script, and then provide skipfish with a session cookie:
$ ./skipfish -C name=val ...other parameters...
Other session cookies may be passed the same way, one per -C option.
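For example, to supply two session cookies in one run (names and values are placeholders):
$ ./skipfish -C name1=val1 -C name2=val2 ...other parameters...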
Certain URLs on the site may log out your session; you can combat this in two ways: by using the -N option, which causes the scanner to reject attempts to set or delete cookies; or with the -X parameter, which prevents matching URLs from being fetched:
$ ./skipfish -X /logout/logout.aspx ...other parameters...
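If you go the -N route instead, it can be combined with your session cookie along these lines (cookie name is a placeholder):
$ ./skipfish -N -C name=val ...other parameters...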
The -X option is also useful for speeding up your scans by excluding /icons/, /doc/, /manuals/, and other standard, mundane locations along these lines. In general, you can use -X, plus -I (only spider URLs matching a substring) and -S (ignore links on pages where a substring appears in response body) to limit the scope of a scan any way you like - including restricting it only to a specific protocol and port:
$ ./skipfish -I http://example.com:1234/ ...other parameters...
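Similarly, several exclusions can be stacked by repeating -X, for example (the excluded paths are just illustrative):
$ ./skipfish -X /icons/ -X /doc/ -X /manuals/ ...other parameters...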
Another useful scoping option is -D - allowing you to specify additional hosts or domains to consider in-scope for the test. By default, all hosts appearing in the command-line URLs are added to the list - but you can use -D to broaden these rules, for example:
$ ./skipfish -D test2.example.com -o output-dir http://test1.example.com/
...or, for a domain wildcard match, use:
$ ./skipfish -D .example.com -o output-dir http://test1.example.com/
In some cases, you do not want to actually crawl a third-party domain, but you trust the owner of that domain enough not to worry about cross-domain content inclusion from that location. To suppress warnings, you can use the -B option, for example:
$ ./skipfish -B .google-analytics.com -B .googleapis.com ...other parameters...
By default, skipfish sends minimalistic HTTP headers to reduce the amount of data exchanged over the wire; however, some sites examine User-Agent strings or header ordering to reject unsupported clients. In such a case, you can use -b ie or -b ffox to mimic one of the two popular browsers.
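For instance, to pretend to be MSIE while leaving everything else unchanged:
$ ./skipfish -b ie ...other parameters...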
But seriously, how to run it?
A standard, authenticated scan of a well-designed and self-contained site (warns about all external links, e-mails, mixed content, and caching header issues):
$ ./skipfish -MEU -C "AuthCookie=value" -X /logout.aspx -o output_dir http://www.example.com/
Five-connection crawl, but no brute-force; pretending to be MSIE and caring less about ambiguous MIME or character set mismatches:
$ ./skipfish -m 5 -LVJ -W /dev/null -o output_dir -b ie http://www.example.com/
Brute force only (no HTML link extraction), trusting links within example.com and timing out after 5 seconds:
$ ./skipfish -B .example.com -O -o output_dir -t 5 http://www.example.com/
For a short list of all command-line options, try ./skipfish -h.
Oy! Something went horribly wrong!
There is no web crawler so good that there wouldn't be a web framework to one day set it on fire. If you encounter what appears to be bad behavior (e.g., a scan that takes forever and generates too many requests, completely bogus nodes in scan output, or outright crashes), please first check our known issues page. If you can't find a satisfactory answer there, recompile the scanner with:
$ make clean debug
...and re-run it this way:
$ ./skipfish [...previous options...] 2>logfile.txt
You can then inspect logfile.txt to get an idea what went wrong; if it looks like a scanner problem, please scrub any sensitive information from the log file and send it to the author.
If the scanner crashed, please recompile it as indicated above, and then type:
$ ulimit -c unlimited
$ ./skipfish [...previous options...] 2>logfile.txt
$ gdb --batch -ex back ./skipfish core
...and be sure to send the author the output of that last command as well.
Original source: http://code.google.com/p/skipfish/