Searching across the harvested archives

Open Harvester Systems support questions and answers, bug reports, and development issues.


Postby prcgian » Fri Oct 13, 2006 6:35 am

Hi, how is searching across the harvested archives performed?
What is the algorithm? What is the ranking function?

If I use MySQL, does Harvester2 rely on MySQL's own search functions (i.e. full-text search)?

Postby asmecher » Fri Oct 13, 2006 6:56 am

Hi prcgian,

Searching and indexing are implemented in classes/search/*.php using an inverted index. The keywords and indexing information are stored in the MySQL tables called search_keyword_list, search_objects, and search_object_keywords.
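As a rough illustration of what an inverted index looks like, here is a minimal in-memory sketch in Python. It is not the Harvester2 code (which is PHP and backed by MySQL). The class name, method names, and the way the three dictionaries map onto search_keyword_list, search_objects, and search_object_keywords are assumptions made for illustration only.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index; the table correspondences in comments are assumptions."""

    def __init__(self):
        # keyword text -> keyword id (loosely analogous to search_keyword_list)
        self.keyword_ids = {}
        # object id -> indexed text (loosely analogous to search_objects)
        self.objects = {}
        # keyword id -> set of object ids (loosely analogous to search_object_keywords)
        self.postings = defaultdict(set)

    def add(self, object_id, text):
        """Index a piece of text under the given object id."""
        self.objects[object_id] = text
        for word in text.lower().split():
            # Assign a new id the first time a keyword is seen.
            keyword_id = self.keyword_ids.setdefault(word, len(self.keyword_ids) + 1)
            self.postings[keyword_id].add(object_id)

    def lookup(self, word):
        """Return the ids of all objects that contain the keyword."""
        keyword_id = self.keyword_ids.get(word.lower())
        if keyword_id is None:
            return set()
        return set(self.postings[keyword_id])
```

The point of the structure is that a lookup goes from keyword to objects directly, so a search does not need to scan the text of every harvested record.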

The algorithm works very roughly as follows: the search string is split into keywords (with a quoted phrase being treated as a single "keyword"). The numbers of results for each keyword (with a maximum number as defined in the configuration file) are added and the final ranking is calculated based on those totals.
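One possible reading of the steps above, sketched in Python. This is only an interpretation of the description, not the actual retrieveResults implementation: the function names, the `max_per_keyword` parameter, its default value, and the naive substring matching (used here instead of a real index lookup) are all assumptions.

```python
import re
from collections import Counter


def split_query(query):
    """Split a search string into keywords; a quoted phrase is one keyword."""
    pairs = re.findall(r'"([^"]+)"|(\S+)', query)
    return [(phrase or word).lower() for phrase, word in pairs]


def rank_results(query, documents, max_per_keyword=500):
    """Return document ids ordered by the summed match counts of all keywords.

    `documents` maps a document id to its text. `max_per_keyword` stands in
    for the configurable per-keyword result limit (hypothetical name).
    """
    totals = Counter()
    for keyword in split_query(query):
        hits = []
        for doc_id, text in documents.items():
            # Naive substring count; a real system would consult the index.
            count = text.lower().count(keyword)
            if count:
                hits.append((doc_id, count))
        # Keep at most max_per_keyword results for this keyword.
        for doc_id, count in hits[:max_per_keyword]:
            totals[doc_id] += count
    # Highest total first; ties are broken by document id.
    return sorted(totals, key=lambda doc_id: (-totals[doc_id], doc_id))
```

For example, with three short documents, the query `open "harvester systems" harvester` yields three keywords, and the document that matches all of them ranks first:

```python
docs = {
    1: "open access and open data",
    2: "open harvester systems",
    3: "closed archive",
}
print(rank_results('open "harvester systems" harvester', docs))
```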

The search algorithm itself can be found in classes/search/Search.inc.php in the retrieveResults function.

Regards,
Alec Smecher
Public Knowledge Project Team

Ok

Postby prcgian » Fri Oct 13, 2006 10:50 am

Thank you, your information is very useful!

