opensearchserver  1.5.14.src
About: Open Search Server is both a modern search engine and a suite of full text search algorithms (based on Apache Lucene and Apache Tomcat). Sources.
  Fossies Dox: opensearchserver-1.5.14.src.tar.gz  (unofficial and still experimental Doxygen-generated source code documentation)

opensearchserver Documentation

README.md

OpenSearchServer

Join the chat at https://gitter.im/jaeksoft/opensearchserver

Copyright Emmanuel Keller / Jaeksoft (2008-2016). This software is licensed under the GPL v3.

OpenSearchServer is a powerful, enterprise-class search engine. Using the web user interface, the crawlers (web, file, database, ...) and the RESTful API, you can quickly and easily integrate advanced full-text search capabilities into your application. OpenSearchServer runs on Linux/Unix/BSD/Windows.

Quickstart

One requirement

You need a Java 7 (or newer) runtime installed on your server.
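
To check the requirement above before going further, the following Python sketch reads the banner printed by `java -version` and extracts the major version. It assumes that a `java` executable is on the PATH; the function names are illustrative and are not part of OpenSearchServer.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        return None
    parts = re.split(r"[._\-+]", match.group(1))
    try:
        major = int(parts[0])
        # Releases before Java 9 report themselves as 1.x (for example 1.7.0_80).
        if major == 1 and len(parts) > 1:
            major = int(parts[1])
    except ValueError:
        return None
    return major


def installed_java_major(java_cmd="java"):
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            [java_cmd, "-version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    # The JVM prints its version banner on stderr; stdout is checked as a fallback.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No Java runtime found on PATH")
    elif major < 7:
        print(f"Java {major} found, but Java 7 or newer is required")
    else:
        print(f"Java {major} found")
```

Handling the old `1.x` numbering matters here because a Java 7 runtime reports itself as `1.7`, not `7`.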

Download the latest ZIP or TAR.GZ archive:

http://www.opensearchserver.com/#download

Extract the archive to get the following files:

  • FILE opensearchserver.jar -> the main library
  • FILE README.md -> this file
  • DIR data -> will contain your indexes
  • DIR server -> will contain the server files
  • FILE start.sh -> Shell script to start the server on Unix
  • FILE start.bat -> Batch script to start the server on Windows
  • FILE NOTICE.txt -> the third-party license information
  • DIR LICENSES -> contains the detailed licenses
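
As a quick sanity check after extraction, the listing above can be compared against the directory on disk. This is a minimal Python sketch; it assumes the archive was extracted into a directory named `opensearchserver` (the name used in the start command further down), and `missing_entries` is an illustrative helper, not something shipped with the project.

```python
from pathlib import Path

# Entries listed in the README; True marks a directory, False a regular file.
EXPECTED_ENTRIES = {
    "opensearchserver.jar": False,
    "README.md": False,
    "data": True,
    "server": True,
    "start.sh": False,
    "start.bat": False,
    "NOTICE.txt": False,
    "LICENSES": True,
}


def missing_entries(root):
    """Return the sorted names of expected entries that are absent under `root`."""
    root = Path(root)
    missing = []
    for name, is_dir in EXPECTED_ENTRIES.items():
        path = root / name
        present = path.is_dir() if is_dir else path.is_file()
        if not present:
            missing.append(name)
    return sorted(missing)
```

Calling `missing_entries("opensearchserver")` returns an empty list when every listed file and directory is present.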

Edit the parameters

Optionally, you can change the parameters in the start.sh/start.bat script:

  • The allowed memory size
  • The TCP port (9090 by default)

Start the server

cd opensearchserver
./start.sh

Use the web interface and/or the API

http://localhost:9090
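
If the page does not load, it can help to confirm that something is actually listening on the port. The Python sketch below only tests whether a TCP connection can be opened; it assumes the default port 9090 mentioned above and does not verify that the listener is OpenSearchServer itself. The function name `is_listening` is illustrative.

```python
import socket


def is_listening(host="localhost", port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

If you changed the TCP port in start.sh/start.bat, pass that value as `port` instead.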

Features

Search functions

  • Advanced full-text search features
  • Phonetic search
  • Advanced boolean search with query language
  • Clustered results with faceting and collapsing
  • Filter search using sub-requests (including negative filters)
  • Geolocation
  • Spell-checking
  • Relevance customization
  • Search suggestion facility (auto-completion)

Indexing

  • Supports 18 languages
  • Field schema with analyzers for each language
  • Several filters: n-gram, lemmatization, shingle, diacritics stripping, …
  • Automatic language recognition
  • Named entity recognition
  • Word synonyms and expression synonyms
  • Export indexed terms with frequencies
  • Automatic classification
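
For readers unfamiliar with the filter names in the list above, here is a small Python illustration of what n-gram, shingle and diacritics-stripping filters generally do to text. It shows the general techniques only and is not OpenSearchServer's implementation; the function names are made up for this example.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks, so that 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def char_ngrams(token, n):
    """Return all contiguous character n-grams of a token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def shingles(tokens, size):
    """Return word shingles: each run of `size` adjacent tokens joined by a space."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


print(strip_diacritics("café"))                    # cafe
print(char_ngrams("search", 3))                    # ['sea', 'ear', 'arc', 'rch']
print(shingles(["open", "search", "server"], 2))   # ['open search', 'search server']
```

N-grams help with partial-word matching, shingles help with phrase matching, and diacritics stripping lets "cafe" match "café".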

Supported document formats

  • HTML / XHTML
  • MS Office documents (Word, Excel, PowerPoint, Visio, Publisher)
  • OpenOffice documents
  • Adobe PDF (with OCR)
  • RTF, plain text
  • Audio file metadata (WAV, MP3, AIFF, Ogg)
  • Torrent files
  • OCR over images

Crawlers

  • The web crawler for internet, extranet and intranet
  • The file system crawler for local and remote files (NFS, SMB/CIFS, FTP, FTPS, SWIFT)
  • The database crawler for all JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server, …)
  • Inclusion and exclusion filters with wildcards
  • Session parameter removal
  • SQL join and linked files support
  • Screenshot capture
  • Sitemap import
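
To make two of the web crawler features above more concrete, the Python sketch below shows the general idea behind wildcard inclusion/exclusion filters and session parameter removal. It is an illustration only: the pattern syntax is that of Python's `fnmatch` module, the parameter names in `SESSION_PARAMS` are hypothetical, and the crawler's own pattern syntax and configuration may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names, chosen for illustration only.
SESSION_PARAMS = {"jsessionid", "phpsessid", "sid"}


def remove_session_params(url, names=SESSION_PARAMS):
    """Return the URL without query parameters whose lowercased name is in `names`."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in names
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def is_allowed(url, include, exclude=()):
    """Accept a URL that matches an inclusion pattern and no exclusion pattern."""
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(url, pattern) for pattern in include)


include = ["http://example.com/*"]
exclude = ["*/private/*"]
print(is_allowed("http://example.com/docs/a.html", include, exclude))  # True
print(is_allowed("http://example.com/private/x", include, exclude))    # False
print(remove_session_params("http://example.com/page?id=3&PHPSESSID=abc123"))
# http://example.com/page?id=3
```

Removing session parameters before indexing keeps the same page from being crawled repeatedly under different URLs.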

General

  • REST API (XML and JSON)
  • Monitoring module
  • Index replication
  • Scheduler for management of periodic tasks