ISBN-13: 9786204984025 / English / Paperback / 224 pp.
The Internet is large and has grown enormously, and search engines are the main tools for navigating and searching it. Search engines maintain indices of web documents and provide search facilities by continuously downloading web pages for processing. This process of downloading web pages is known as web crawling. This book proposes an architecture for an effective migrating parallel web crawling approach with a domain-specific and incremental crawling strategy. The major advantage of a migrating parallel web crawler is that the analysis portion of the crawling process is done locally, where the data resides, rather than inside the search engine's repository. This significantly reduces network load and traffic, which in turn improves the performance, effectiveness, and efficiency of the crawling process. Another advantage is that, as the web grows and parallelizing the crawl becomes necessary, a migrating parallel crawler can finish downloading web pages in a comparatively shorter time. Domain-specific crawling yields high-quality pages.
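The idea described above can be illustrated with a minimal Python sketch. It is not the architecture proposed in the book. It only assumes that URLs are partitioned by host, that each worker analyses its own site's pages, and that only a compact list of domain-relevant URLs is returned to the caller. The in-memory `PAGES` dictionary, the function names, and the keyword-based relevance test are all hypothetical stand-ins for real downloads and real domain classification.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

# Hypothetical in-memory "web": URL -> page text (stands in for real downloads).
PAGES = {
    "http://a.example/1": "web crawler index",
    "http://a.example/2": "parallel crawler",
    "http://b.example/1": "cooking recipes",
}

def partition_by_host(urls):
    """Group URLs by host so that each worker handles one site."""
    groups = defaultdict(list)
    for url in urls:
        groups[urlparse(url).netloc].append(url)
    return dict(groups)

def crawl_site(host, urls, keywords):
    """Analyse one site's pages and return only a compact summary:
    the URLs whose text contains at least one domain keyword."""
    relevant = [u for u in urls if any(k in PAGES.get(u, "") for k in keywords)]
    return host, sorted(relevant)

def parallel_crawl(urls, keywords, workers=4):
    """Run one crawl task per host in parallel and merge the summaries."""
    groups = partition_by_host(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(crawl_site, host, site_urls, keywords)
                   for host, site_urls in groups.items()]
        return dict(f.result() for f in futures)

result = parallel_crawl(list(PAGES), keywords=["crawler"])
print(result)
```

In this sketch the full page text never leaves `crawl_site`, which mirrors the blurb's claim that local analysis reduces the data sent back to the search engine. A real migrating crawler would run that step on or near the remote host rather than in a local thread.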