Query module


The Querying module
Developer: Calvin Pedzai

The querying module of the search engine receives queries and returns the most relevant results to the user. A user enters a query through the search engine interface to search for information on a particular subject. The query is treated as a string of keywords, which the search engine uses to find the documents most relevant to the user’s request.

The aims of the module are to

System Overview


Dynamic Allocation

Computing the load on each node and comparing the load averages of the querying and indexing subsystems helps to determine how to dynamically allocate the nodes of the cluster to indexing and querying jobs. If the search engine is processing an increasing number of queries, the node allocation within the cluster can be revised to balance the workload evenly across machines.
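The comparison of load averages described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `reallocate`, the `threshold` parameter and the policy of moving one node at a time are hypothetical and are not taken from the search engine's actual Dispatcher.

```python
from statistics import mean

def reallocate(query_loads, index_loads, threshold=0.5):
    """Suggest new node counts for the querying and indexing subsystems.

    query_loads / index_loads: one load figure per node currently assigned
    to each subsystem (both lists must be non-empty).
    Assumed policy (hypothetical): if one subsystem's load average exceeds
    the other's by more than `threshold`, move a single node towards it,
    always leaving at least one node on each side.
    """
    q_avg = mean(query_loads)
    i_avg = mean(index_loads)
    q_nodes, i_nodes = len(query_loads), len(index_loads)

    if q_avg - i_avg > threshold and i_nodes > 1:
        return q_nodes + 1, i_nodes - 1   # querying is busier: take a node from indexing
    if i_avg - q_avg > threshold and q_nodes > 1:
        return q_nodes - 1, i_nodes + 1   # indexing is busier: take a node from querying
    return q_nodes, i_nodes               # loads are close enough: leave the allocation alone

# Two heavily loaded query nodes and three lightly loaded index nodes.
print(reallocate([3.0, 2.6], [0.4, 0.6, 0.5]))
```

Moving one node per decision keeps the allocation from swinging back and forth when the load changes quickly; a real scheduler might use a different step size or a smoothed load figure.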


Perhaps the most crucial aspect of the search engine, in terms of the speed of data retrieval, is the cache. The speed at which a query is processed depends on the communication speed between nodes, so caching is used to reduce the number of files that need to be copied across to a worker node from the Dispatcher. If a file already exists in a local directory on the node, that local copy is used instead of the file from a remote directory. This reduces the time spent sending files around the cluster and improves the overall performance of the system.
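The local-copy check can be illustrated with a minimal sketch. The function name `fetch`, the directory arguments and the returned hit flag are assumptions made for this example; here a plain directory stands in for the files held by the Dispatcher, whereas the real system would transfer them across the cluster.

```python
import shutil
from pathlib import Path

def fetch(name, cache_dir, remote_dir):
    """Return (local_path, cache_hit) for the file `name`.

    The file is copied from `remote_dir` only when it is not already
    present in `cache_dir` on the worker node.
    """
    cache_dir = Path(cache_dir)
    local = cache_dir / name
    if local.exists():
        return local, True                            # cache hit: no transfer needed
    cache_dir.mkdir(parents=True, exist_ok=True)      # first use: create the local directory
    shutil.copy2(Path(remote_dir) / name, local)      # cache miss: copy the file once
    return local, False
```

A second call to `fetch` with the same name returns the local copy without touching the remote directory, which is the saving in communication time described above. This sketch does not handle stale copies; it assumes files do not change once they have been cached.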



Dynamic allocation of cluster nodes lets the search engine handle user queries efficiently, and the load balancing done by the Dispatcher improves utilisation of the cluster. Caching improves the performance of the querying module, and the availability of processors in the cluster makes the system scalable.


Copyright 2006 Ndapandula Nakashole & Calvin Pedzai. All rights reserved. Last update: 10 November 2006