We have something like this:
Quote:
*?=productcode&=1
/*.html?js=*
/*?=productcode&=1
/?
/?*
/?=productcode&=1
/?sort=*
/admin/
/antibot_image.php
/cart.php
/catalog/
/error_message.php
/files/
/giftcert.php
/giftreg_manage.php
/giftregs.php
/help.php
/home.php?cat=*
/icon.php
/image.php
/include/
/modules/
/offers.php
/orders.php
/payment/
/product.php
/product.php*
/product_image.php
/register.php
/search.php
/shop_closed.html
/sql/
/upgrade/
/var/
?*
Some are probably duplicates, and some are not necessary, but we figure this works so far.
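If you want to check which entries overlap, here is a rough Python sketch. It is not how GSiteCrawler itself evaluates the list; it assumes that `*` stands for any run of characters and that a pattern matches when it occurs anywhere in the URL. The sample URLs and the function names are made up for illustration.

```python
import re

# A handful of patterns copied from the list above.
PATTERNS = [
    "/admin/",
    "/cart.php",
    "/home.php?cat=*",
    "/product.php",
    "/product.php*",
    "/?sort=*",
    "?*",
]


def to_regex(pattern):
    # Escape every literal piece, then join the pieces with ".*" so that
    # each "*" in the pattern means "any run of characters".
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))


def matching_patterns(url, patterns=PATTERNS):
    # ASSUMPTION: a pattern matches if it is found anywhere in the URL.
    # The real tool may anchor patterns differently.
    return [p for p in patterns if to_regex(p).search(url)]


def is_banned(url, patterns=PATTERNS):
    # A URL is skipped as soon as at least one pattern matches it.
    return bool(matching_patterns(url, patterns))


if __name__ == "__main__":
    for url in ["/product.php?productid=42", "/?sort=price", "/pages/about.html"]:
        print(url, "->", matching_patterns(url))
```

Under these assumptions `/product.php*` never catches anything that `/product.php` does not already catch, and `?*` covers every URL with a query string, which is the kind of duplication mentioned above.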

This includes more than what is in our robots.txt.
It takes under 5 minutes (2-3 minutes) to go through and process the site (it generated 300ish links). You may want to limit it on shared hosting, though, since it can be pretty intense on your website. Also, if your host locks your site for CPU overuse, GSiteCrawler doesn't realize that and continues to crawl.