I've been struggling with the Lucene plugin for a few weeks now and it is slowly driving me mad. OK, the README warned me that getting remote server mode running is tricky. My situation is aggravated by the fact that we had to book a SolR server package for which we don't have configuration admin access ourselves (we can access some of the admin diagnostic tools, though). I sent the server setup instructions to the admin and have been told that they were applied (that actually took more than one step, but everything should be in place by now). In particular, I've been assured that the DataImportHandler is configured the way the README describes. However, when I run the rebuildSearchIndex.php script from /tools, nothing gets indexed in Lucene.
Here's the process the way I understand it:
- The Lucene plugin builds the XML streams to be POSTed to the SolR server, and it seems to do that correctly (I captured one sample XML POST from the process using echo and put it [url src=http://www.reinhardt-journals.de/public/Sample_SolR_POST_Doc.xml]here[/url]).
- The POST is sent to the correct URL of the DataImportHandler.
- The SolR server returns HTTP code 200.
- However, the $result variable assigned in the line
$result = $this->_makeRequest($url, $articleXml, 'POST');
- The number of indexed files, as counted by $numIndexed, is always 0, which is also output to the command shell: ... 0 articles indexed.
- The troubleshooting section of the README says I should check the luke record. The luke record says the same as $numIndexed:
<lst name="index">
  <int name="numDocs">0</int>
  <int name="maxDoc">0</int>
  <int name="numTerms">0</int>
  ...
</lst>
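Since HTTP code 200 alone says little, one thing I could still try is to look at the body the server sends back. Here is a minimal sketch in plain PHP, not taken from the plugin: the helper name dihStatusMessages() and the sample body are made up for illustration, and it assumes the server answers in SolR's XML response format with a <lst name="statusMessages"> block.

```php
<?php
// Hypothetical helper (not part of the Lucene plugin): pulls the
// DataImportHandler counters out of an XML response body.
function dihStatusMessages($responseXml) {
    // Collect parse errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($responseXml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($xml === false) {
        return null; // body was not well-formed XML
    }

    $messages = array();
    $nodes = $xml->xpath('//lst[@name="statusMessages"]/str');
    if (is_array($nodes)) {
        foreach ($nodes as $node) {
            $messages[(string) $node['name']] = trim((string) $node);
        }
    }
    return $messages;
}

// Made-up sample body; a real one would be whatever the POST returned.
$sample = '<response>'
    . '<lst name="responseHeader"><int name="status">0</int></lst>'
    . '<str name="status">idle</str>'
    . '<lst name="statusMessages">'
    . '<str name="Total Rows Fetched">2</str>'
    . '<str name="Total Documents Processed">0</str>'
    . '</lst>'
    . '</response>';

print_r(dihStatusMessages($sample));
```

Passing $result from the line above into such a helper (assuming $result holds the raw response body, which I have not verified) would at least show whether the handler reports any rows or documents for that particular POST.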
I can get some statistics about the DataImportHandler's performance. They look like this:
Status : IDLE
Documents Processed : 0
Requests made to DataSource : 0
Rows Fetched : 2
Documents Deleted : 0
Documents Skipped : 0
Total Documents Processed : 0
Total Requests made to DataSource : 0
Total Rows Fetched : 0
Total Documents Deleted : 0
Total Documents Skipped : 0
handlerStart : 1353407720923
requests : 156
errors : 0
timeouts : 0
totalTime : 141659
avgTimePerRequest : 908.0705
avgRequestsPerSecond : 1.9848118E-4
I can also see statistics of the update requests that are triggered by every run of rebuildSearchIndex to clear the index (I guess):
commits : 18
autocommits : 0
optimizes : 0
rollbacks : 155
expungeDeletes : 0
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 0
cumulative_deletesById : 0
cumulative_deletesByQuery : 18
cumulative_errors : 0
Why does the SolR server return code 200 but not index the articles?
Best regards,
Kai Weber
