I need to integrate Lucene search capabilities into my journal management system. As you know, Lucene has a PHP port, but it is not enough to build a capable IR system. I think some core components are missing on the PHP side.
I want to develop a jar file that takes some parameters (file path, language, status (insert, update or delete), journalid, DC contents, etc.) from the console. First, the jar file will use Apache Tika (a text-extraction library that bundles many text and metadata extraction libraries, such as PDFBox) to extract text from various media formats. Then it will integrate strong stemming algorithms for various languages. The jar file will also create and update the Lucene index. The programming language used for indexing should not matter: you just create an index and then search it with another Lucene port. I want to use the Lucene PHP port on the search side (again, I would need the jar file to stem the search words/tokens on the PHP side).
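To make the console interface concrete, here is a minimal sketch of how such a jar might parse its parameters. It uses only the Java standard library. The class name `IndexJob`, the `--key=value` flag style and the flag names are all assumptions made for illustration, and the Tika extraction, stemming and Lucene index steps are left as a placeholder comment rather than implemented.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical command-line contract for the indexing jar; every name here is illustrative. */
final class IndexJob {
    /** The three operations named in the post. */
    enum Status { INSERT, UPDATE, DELETE }

    final String filePath;   // may be null for DELETE
    final String language;   // may be null for DELETE
    final Status status;
    final String journalId;

    private IndexJob(String filePath, String language, Status status, String journalId) {
        this.filePath = filePath;
        this.language = language;
        this.status = status;
        this.journalId = journalId;
    }

    /** Parses arguments of the assumed form --key=value. */
    static IndexJob parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (!arg.startsWith("--") || eq < 3) {
                throw new IllegalArgumentException("Expected --key=value, got: " + arg);
            }
            opts.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        // Status.valueOf throws IllegalArgumentException for anything but insert/update/delete.
        Status status = Status.valueOf(require(opts, "status").toUpperCase(Locale.ROOT));
        String journalId = require(opts, "journalid");
        // Assumption: a delete needs only the journal id; insert and update
        // also need a file to extract text from and a language for stemming.
        boolean needsFile = status != Status.DELETE;
        String filePath = needsFile ? require(opts, "file") : opts.get("file");
        String language = needsFile ? require(opts, "language") : opts.get("language");
        return new IndexJob(filePath, language, status, journalId);
    }

    private static String require(Map<String, String> opts, String key) {
        String value = opts.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Missing --" + key);
        }
        return value;
    }

    public static void main(String[] args) {
        IndexJob job = parse(args);
        // Placeholder: text extraction, stemming and the index write would go here.
        System.out.println(job.status + " journal " + job.journalId);
    }
}
```

With this layout the PHP side would invoke something like `java -cp indexer.jar IndexJob --file=/path/to/article.pdf --language=tr --status=insert --journalid=42`, where the jar name and path are also hypothetical.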
But I have a problem. If I develop the core IR system in Java, some application hosting organizations (for example, some universities) will ban the exec function or jar execution for security reasons. So I will write once, but not run everywhere.
Thanks for your time and consideration,
