Creating a robots.txt
A robots.txt file is simply a plain text file that can be created with any text editor (such as Notepad) and saved with the .txt extension. Upload it to the root of your website so search engines can find it at http://www.domain.com/robots.txt.
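As a quick sketch, the file can be written programmatically too; here is a minimal Python example (the `/private/` rule and the domain are hypothetical, and the upload step depends on your own hosting setup):

```python
from pathlib import Path

# Write a minimal robots.txt locally (the rule shown is hypothetical).
# Upload the resulting file to your web root so it is served at
# http://www.domain.com/robots.txt
Path("robots.txt").write_text("User-agent: *\nDisallow: /private/\n")

print(Path("robots.txt").read_text())
```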
Denying bots from indexing using a robots.txt file
To deny bots from an entire website:
User-agent: *
Disallow: /
To deny all bots from indexing a specific page:
User-agent: *
Disallow: /page.html
To deny all bots from indexing a folder you can use the following:
User-agent: *
Disallow: /folder/
To deny all bots from indexing any URL containing 'monkey' by using a wildcard:
User-agent: *
Disallow: /*monkey
To deny dynamic URLs which contain a '?', use the same method - again a wildcard:
User-agent: *
Disallow: /*?
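The wildcard matching above can be sketched in code. This is a simplified illustration of Google-style pattern matching, not part of any robots.txt library; the `matches` function and the example URLs are hypothetical:

```python
import re

def matches(rule: str, path: str) -> bool:
    """Google-style wildcard match: '*' matches any run of characters,
    a trailing '$' anchors the end; otherwise rules match as prefixes."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything except '*', which becomes '.*'
    regex = "^" + ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        regex += "$"
    return re.search(regex, path) is not None

print(matches("/*monkey", "/toys/monkey-bars.html"))  # True
print(matches("/*?", "/shop/item?id=42"))             # True
print(matches("/*?", "/shop/item.html"))              # False
```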
To specify which bot you want to block, change the User-agent line. To deny Googlebot:
User-agent: Googlebot
Disallow: /page.html
Disallow: /folder/
Disallow: /*monkey
Disallow: /*?
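You can sanity-check rules like these with Python's standard-library robots.txt parser. Note that `urllib.robotparser` follows the original robots exclusion spec and does not understand the wildcard rules above, so only the literal path-prefix rules are tested in this sketch:

```python
from urllib.robotparser import RobotFileParser

# Rules mirroring the examples above (wildcard lines omitted,
# since the standard-library parser does not support them)
rules = """\
User-agent: Googlebot
Disallow: /page.html
Disallow: /folder/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("Googlebot", "/page.html"))   # False
print(parser.can_fetch("Googlebot", "/folder/x"))    # False
print(parser.can_fetch("Googlebot", "/other.html"))  # True
```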
How to remove a page from Google that has been indexed
Google unofficially supports a Noindex directive in robots.txt, so if you specify a page with it, you can then log in to Google Webmaster Tools, go to Site Configuration > Crawler Access > Remove URL, and request its removal.
User-agent: Googlebot
Noindex: /page.html
When not to use a robots.txt
Just as you can access a web page through a browser, anyone can look at your robots.txt file, so it's important not to use a robots.txt to block a private page, or a page that isn't even linked to from your website (in that case a bot won't be able to find it anyway).
The other issue is that not all dynamic URLs follow a pattern that allows them to be blocked easily by a robots.txt. In those cases you can deliver the robots directive another way, by setting an X-Robots-Tag header.
Using X-Robots Tag
Setting an X-Robots-Tag is the more discreet way of blocking a URL. You can test your page headers using this HTTP Request and Response Header Tool.
With PHP you can tell bots not to index or archive the page, not to show a snippet, and not to follow ('nofollow') the links on the page:
header("X-Robots-Tag: noindex, nofollow, noarchive, nosnippet", true);
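The same header can be sent from any server-side environment. Here is a minimal sketch using Python's standard-library `http.server` (the handler class name is illustrative, and the server fetches its own headers just to demonstrate the result):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class NoIndexHandler(BaseHTTPRequestHandler):
    """Serve every response with an X-Robots-Tag header."""
    def do_GET(self):
        self.send_response(200)
        # Same directives as the PHP example above
        self.send_header("X-Robots-Tag", "noindex, nofollow, noarchive, nosnippet")
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html><body>Hidden from indexes</body></html>")

    def log_message(self, *args):  # silence request logging for the demo
        pass

# Start the server on a spare port and fetch our own headers
server = HTTPServer(("127.0.0.1", 0), NoIndexHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = urllib.request.urlopen(
    "http://127.0.0.1:%d/page.html" % server.server_address[1])
print(response.headers["X-Robots-Tag"])  # noindex, nofollow, noarchive, nosnippet
server.shutdown()
```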
Using an .htaccess file you can do the same with FilesMatch:
<FilesMatch "page\.html">
Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>