Legal and Ethical
Legal and Ethical Issues
Web scraping (also called web harvesting) raises many legal and ethical issues, and the boundaries are not always clear. Some web sites publish terms of use, and extracting data from them may conflict with those terms. Some sites also take technical measures to stop scraping, such as blocking an IP address or monitoring traffic levels. Even when a site has posted terms of use, the enforceability of those terms remains unclear. Portal sites that crawl and index systematically have typically been allowed to go about their business, as they tend to complement the sites being harvested. There has also been a recent court ruling that duplication of facts is allowable.
There have also been some efforts to use trespassing laws against bots or crawlers, but these efforts have often failed to meet the burden of showing that the crawler interfered with the site or caused damage.
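Because sites may monitor traffic levels, and because interference or damage is the point at issue in the trespass claims, a crawler can limit how often it contacts any one host. The following is only a minimal sketch of that idea, not a legal safeguard: the class name `HostThrottle`, its parameters, and the one-second default delay are illustrative assumptions and do not come from any particular library.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforce a minimum delay between requests to the same host.

    Hypothetical helper for illustration; the default delay is an assumption.
    """

    def __init__(self, min_delay_seconds=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay_seconds
        self._clock = clock      # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last_request = {}  # host -> time of the most recent request

    def wait(self, url):
        """Pause if the host was contacted too recently; return the seconds slept."""
        host = urlparse(url).netloc
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            # Time still owed before this host may be contacted again.
            delay = max(0.0, self.min_delay - (self._clock() - last))
            if delay > 0:
                self._sleep(delay)
        self._last_request[host] = self._clock()
        return delay
```

A crawler would call `throttle.wait(url)` immediately before each request. Requests to different hosts do not delay one another, while repeated requests to the same host are spaced at least `min_delay_seconds` apart.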
For more information on the legal and ethical issues, see Wikipedia.