SOLR Security with ManifoldCF
The ManifoldCF security model is loosely based on the standard authorization concepts and hierarchies found in Microsoft’s Active Directory. ManifoldCF defines the concept of an access token. In the ManifoldCF security model, it is the job of an authority to provide a list of access tokens for a given searching user. Multiple authorities cooperate, in that each one can add to the list of access tokens describing a given user’s security. The sections below explain how to set up ManifoldCF, how to use the ManifoldCF crawler UI, and how to configure the ManifoldCF plugin with SOLR.
- Setup ManifoldCF
- Configuration of ManifoldCF with SOLR
Setup ManifoldCF:
- This section explains how to set up ManifoldCF.
- Download the ManifoldCF binary distribution from https://manifoldcf.apache.org/en_US/download.html and unzip it.
- Open a command prompt and run start.bat to start ManifoldCF.
This starts ManifoldCF with the required services running and the desired connection types properly registered.
- The ManifoldCF user interface can then be accessed through the crawler UI.
- When you enter the Framework user interface for the first time, you will be asked to log in.
Enter the login username and password for your system. By default, the username is “admin” and the password is “admin”.
- Create an output connection by clicking “List Output Connections”.
- Enter a name and description, then select the Type tab, choose the SOLR output connection type, and continue.
Select single server as the Solr type, since we are setting up on a single box.
Select the Server tab to configure SOLR.
Select the Schema tab, enter the primary key information of the existing Solr collection, and save.
Create an authority group by clicking “List Authority Groups” and “Add a new authority group”.
User Mapping Connections
Create a mapping connection by clicking “List User Mapping Connections” and “Add a new connection”.
Select regular expression mapper as the type and save. If everything is configured correctly, the crawler displays “connection working”.
- Create an authority connection by clicking the “List Authority Connections” link.
- Create a new connection by clicking “Add new connection”.
- Enter a name and description, then select the Type tab and choose the authority type.
- Select the authority group you created earlier and save.
Configuration of ManifoldCF plugin with SOLR:
This section is a step-by-step guide to configuring the ManifoldCF plugin with Solr.
- Copy $:\apache-manifoldcf-2.3\plugins\solr\solr-X.X\apache-manifoldcf-solr-X.X-plugin-2.2.JAR to the Solr core lib directory.
There are two ways to hook up security to Solr in this package. The first is using a Query Parser plugin. The second is using a Search Component. In both cases, the first step is to have ManifoldCF installed and running.
- Then, you will need to add fields to your Solr schema.xml file that can be used to contain document authorization information. There need to be six of these fields: an ‘allow’ field and a ‘deny’ field for each of documents, parents, and shares.
- The default value of “__nosecurity__” is required by this plugin, so do not forget to include it.
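As a rough illustration of the six fields described above, the Python sketch below prints one possible set of schema.xml field definitions. The field names (allow_token_document and so on) and the attribute choices (type, indexed, stored, multiValued, required) are assumptions made for this example and should be checked against the README shipped with your plugin version. Only the default value “__nosecurity__” comes from the text above.

```python
import xml.etree.ElementTree as ET

# Assumption: one allow field and one deny field per level, named
# "<access>_token_<level>". Verify the names against your plugin's README.
LEVELS = ("document", "parent", "share")
NO_SECURITY = "__nosecurity__"  # default value required by the plugin


def security_fields():
    """Build the six <field> elements (allow/deny x document/parent/share)."""
    fields = []
    for access in ("allow", "deny"):
        for level in LEVELS:
            fields.append(ET.Element("field", {
                "name": f"{access}_token_{level}",
                "type": "string",        # assumed field type
                "indexed": "true",
                "stored": "false",
                "multiValued": "true",
                "required": "true",
                "default": NO_SECURITY,
            }))
    return fields


def render_fields():
    """Return the field definitions as text, one element per line."""
    return "\n".join(ET.tostring(f, encoding="unicode") for f in security_fields())


if __name__ == "__main__":
    print(render_fields())
```

The printed lines can be pasted into the fields section of schema.xml and adjusted to match the field types defined in your own schema.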
Using the Query Parser Plugin
To set up the query parser plugin, modify your solrconfig.xml to add the query parser.
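The query parser entry might look roughly like the element rendered by the Python sketch below. The parser name manifoldCFSecurity, the class name org.apache.solr.mcf.ManifoldCFQParserPlugin, the parameter names AuthorityServiceBaseURL and ConnectionPoolSize, and the localhost URL are all assumptions recalled from the plugin's documentation, not taken from this article, so confirm them against the README for your plugin version.

```python
import xml.etree.ElementTree as ET


def query_parser_config(authority_url, pool_size=50):
    """Render a <queryParser> element for solrconfig.xml.

    All element and parameter names here are assumptions; verify them
    against the plugin README before using the output.
    """
    parser = ET.Element("queryParser", {
        "name": "manifoldCFSecurity",
        "class": "org.apache.solr.mcf.ManifoldCFQParserPlugin",
    })
    # Base URL of the ManifoldCF authority service the plugin will call.
    url = ET.SubElement(parser, "str", {"name": "AuthorityServiceBaseURL"})
    url.text = authority_url
    # Size of the HTTP connection pool used to reach the authority service.
    pool = ET.SubElement(parser, "int", {"name": "ConnectionPoolSize"})
    pool.text = str(pool_size)
    return ET.tostring(parser, encoding="unicode")


if __name__ == "__main__":
    # Assumed default address of a local ManifoldCF authority service.
    print(query_parser_config("http://localhost:8345/mcf-authority-service"))
```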
MCF Authority Service:
Access Token: ManifoldCF defines the concept of an access token. An access token, to ManifoldCF, is a string which is meaningful only to a specific connector or connectors. This string describes the ability of a user to view (or not view) some set of documents. To see the access tokens for a user, request them from the MCF authority service by URL.
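The lookup can be sketched in Python as follows. The base URL (localhost, port 8345), the UserACLs endpoint with its username parameter, and the TOKEN: line prefix in the response are assumptions for a default local install, so adjust them to whatever your authority service actually exposes.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of a local ManifoldCF authority service.
AUTHORITY_BASE = "http://localhost:8345/mcf-authority-service"


def user_acls_url(username, base=AUTHORITY_BASE):
    """Build the URL that asks the authority service for a user's tokens."""
    return f"{base}/UserACLs?{urlencode({'username': username})}"


def parse_tokens(response_text):
    """Extract access tokens from a plain-text response.

    Assumes one item per line, with token lines prefixed by "TOKEN:";
    other lines (status lines, for example) are ignored.
    """
    tokens = []
    for line in response_text.splitlines():
        line = line.strip()
        if line.startswith("TOKEN:"):
            tokens.append(line[len("TOKEN:"):])
    return tokens


def fetch_tokens(username, base=AUTHORITY_BASE):
    """Call the authority service and return the user's access tokens."""
    with urlopen(user_acls_url(username, base)) as resp:
        return parse_tokens(resp.read().decode("utf-8"))
```

Opening the URL returned by user_acls_url in a browser should show the same raw response that fetch_tokens parses.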
Indexing data to SOLR:
- Start the Solr instance and post the document to Solr as XML. In the posted document, the security token fields carry the user token that grants access to the document.
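A minimal Python sketch of such a post is shown below. The token field names (allow_token_document, deny_token_document), the sample field values, and the update URL are hypothetical placeholders, so substitute the field names from your schema.xml and the address of your own core.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request, urlopen


def build_add_xml(doc_id, fields, allow_tokens=(), deny_tokens=()):
    """Build a Solr <add><doc> message that carries security tokens."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def put(name, value):
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value

    put("id", doc_id)
    for name, value in fields.items():
        put(name, value)
    # Assumed field names; each token becomes one value of a multivalued field.
    for token in allow_tokens:
        put("allow_token_document", token)
    for token in deny_tokens:
        put("deny_token_document", token)
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_body, update_url):
    """POST the XML to a Solr update handler and return the HTTP status."""
    request = Request(
        update_url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
    )
    with urlopen(request) as resp:
        return resp.status


if __name__ == "__main__":
    # "token1" is a placeholder for a real access token.
    print(build_add_xml("doc1", {"title": "Hello"}, allow_tokens=["token1"]))
```

Documents posted without any token fields fall back to the schema default of “__nosecurity__”.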
Query data using SOLR Admin:
- If you query without providing a user token, Solr returns only documents whose token is “__nosecurity__” (the default token). In the scenario above, Solr will not return the document indexed earlier.
- If you query with the user's tokens, Solr returns all matching results, including the document indexed earlier.
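The two query cases can be sketched as URLs built in Python. The filter syntax {!manifoldCFSecurity} and the request parameters AuthenticatedUserName and UserTokens are assumptions based on the plugin's documented conventions, and the select URL is a placeholder for your own core.

```python
from urllib.parse import urlencode


def secured_query_url(select_url, q, username=None, user_tokens=()):
    """Build a Solr select URL that applies the security filter.

    Parameter names are assumptions; check them against the plugin README.
    """
    params = [("q", q), ("fq", "{!manifoldCFSecurity}")]
    if username:
        # Let the plugin look up tokens from the authority service.
        params.append(("AuthenticatedUserName", username))
    for token in user_tokens:
        # Or pass explicit tokens, one parameter per token.
        params.append(("UserTokens", token))
    return f"{select_url}?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr/mycore/select"  # placeholder core URL
    print(secured_query_url(base, "*:*"))                          # no tokens
    print(secured_query_url(base, "*:*", user_tokens=["token1"]))  # with a token
```

Pasting either URL into a browser, or entering the same parameters in the Solr Admin query screen, should reproduce the two behaviours described above.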