The SharePoint connector allows Simpplr Enterprise Search to index Microsoft SharePoint Storage content, making it easily discoverable and searchable directly within Simpplr.
With this connector, you can:
Bring SharePoint content into Simpplr Enterprise Search so users can find files alongside intranet content in one place.
Respect SharePoint access permissions so users only see files they already have access to in SharePoint.
Use advanced features like autocomplete, hybrid ranking, and Smart Answers on top of SharePoint content.
Indexed SharePoint content is available in Simpplr Enterprise Search in:
Main search listing
Smart Answers
Content types | Sites, Drive Items, Pages |
Metadata | Name, URL, Owner details, Updated By details, Created Time and Updated Time, Parent Reference, Object type, Extension and Size, Mime Type, Permissions (user- and group-level access) |
Permissions | User and Group based permissions |
Indexing | Initial full crawl when the connector is created, followed by a weekly full crawl. Incremental updates run every hour. |
Multiple instances support | Multiple SharePoint connections can be configured in the Simpplr environment. |
Ingestion Filters |
|
Search features |
|
Objects - The following object types are indexed:
Drive Items
Pages
Sites
Metadata - For each indexed item, the connector captures:
Name
URL / link
Owner details
Updated-by details
Created time and Updated Time
Parent Details
Object Type
File extension and size
Mime Type
Permissions (users and groups level access)
Permissions model - Permissions are read from SharePoint and enforced in Simpplr Enterprise Search:
How user and group permissions are synchronized
SharePoint user and group memberships are fetched and stored in the ACL index.
When a user is added to or removed from a SharePoint group, the ACL index is updated the next time the ACL sync runs (by default, every hour).
How public or link-shared content is handled
Content that is only available via anonymous or public shared links is not indexed in the current version.
What happens when access is removed for a specific document
When a user loses access to a file or folder in SharePoint, the updated permissions are applied during the next ACL sync.
The file will no longer appear in that user’s Simpplr search results after the ACL sync completes.
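The query-time filtering described above can be pictured with a minimal sketch. It assumes each indexed item carries a set of allowed users and allowed groups; the class, field, and function names are hypothetical and are not the connector's actual schema or code.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedItem:
    """Hypothetical indexed document with the principals allowed to see it."""
    name: str
    allowed_users: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def visible_results(items, user_id, user_groups):
    """Keep only items the user can access directly or through a group."""
    groups = set(user_groups)
    return [
        item for item in items
        if user_id in item.allowed_users or groups & item.allowed_groups
    ]


items = [
    IndexedItem("budget.xlsx", allowed_users={"alice@example.com"}),
    IndexedItem("handbook.pdf", allowed_groups={"all-staff"}),
    IndexedItem("board-minutes.docx", allowed_groups={"executives"}),
]

# Bob is only a member of "all-staff", so only the handbook is returned.
print([i.name for i in visible_results(items, "bob@example.com", ["all-staff"])])
```

If Bob is later removed from the group in SharePoint, the same call with an empty group list returns nothing, which mirrors the behavior after the next ACL sync.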
Versions and editions supported
Supports tenants with the SharePoint Storage service enabled.
Before you begin, ensure the following:
Source system Access
Access to Microsoft Entra user account
Application / service account permissions
Sufficient permissions to register an application with your Microsoft Entra tenant, and assign to the application a role in your Azure subscription. To complete these tasks, you'll need the Application.ReadWrite.All permission.
Ability to grant admin consent to the application from the Admin console. If you are not an admin, ask an admin to grant consent through the Azure portal.
SharePoint Source documentation:
Authentication mechanism
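Simpplr Enterprise Search connects to SharePoint through an application registered in Microsoft Entra, using the OAuth 2.0 client-credentials flow with certificate-based authentication.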
There are two options available for assigning permissions:
Full Control (Recommended, low effort setup) - Grants access to all sites
Selected Site Access (High effort setup) - Grants access to specific sites
Graph API
Sites.Read.All
Files.Read.All
Group.Read.All
User.Read.All
Sites.FullControl.All
SharePoint:
Sites.Read.All
Sites.FullControl.All
Ensure admin consent is granted for all the above permissions.
Note: The Sites.FullControl.All permission is required for the following reasons:
Microsoft Graph API: This permission is essential for scanning permission hierarchies using the /delta endpoint.
SharePoint REST API: This permission is necessary to successfully retrieve permissions for site pages and site lists.
For Reference: https://learn.microsoft.com/en-us/graph/api/driveitem-delta?view=graph-rest-1.0&tabs=http#scanning-permissions-hierarchies
Selected Site Access is a multi-step process:
Grant the required permissions (Sites.Selected, instead of Sites.FullControl.All)
Fetch the Site Ids for which access needs to be granted
Assign permissions to the application for each of the Sites
Step 1 - Grant Required Permissions
Microsoft Graph API
Sites.Read.All
Files.Read.All
Group.Read.All
User.Read.All
Sites.Selected
SharePoint
Sites.Read.All
Sites.Selected
Ensure admin consent is granted for all the above permissions.
Post-Setup Configuration in Azure Portal
Once the application is registered and permissions are granted, you must explicitly allow access to specific SharePoint sites (whitelisting), because Sites.Selected grants no site access by default. The user who runs the requests below must have Full Control access on the sites being whitelisted.
Step 2 - Fetch Site Ids (API Explorer)
Use the Microsoft Graph Explorer (https://developer.microsoft.com/en-us/graph/graph-explorer) to retrieve the Site ID.
Authenticate as a user who has admin access and Full Control permission on the SharePoint sites.
Search for a SharePoint site by keyword by pasting the following request into the URL field:
GET https://graph.microsoft.com/v1.0/sites?select=webUrl,Title,Id&$search="<Site Name>*"
This returns a list of sites matching the search term, including their Id.
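If you prefer to build the search request outside Graph Explorer, the URL needs its quotes and spaces percent-encoded. The following is a minimal sketch that reuses the query parameters from the request above; the function name is hypothetical, and it only builds the URL (it sends nothing and handles no authentication).

```python
from urllib.parse import quote, urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_site_search_url(site_name: str) -> str:
    """Build the site search URL shown above, percent-encoding the keyword."""
    params = {
        "select": "webUrl,Title,Id",
        # The keyword is wrapped in double quotes and suffixed with a wildcard.
        "$search": f'"{site_name}*"',
    }
    # Keep ",", "$" and "*" readable; quotes and spaces are encoded.
    query = urlencode(params, quote_via=quote, safe=",$*")
    return f"{GRAPH_BASE}/sites?{query}"


print(build_site_search_url("HR Portal"))
```

The printed URL can be pasted into Graph Explorer or used with any HTTP client that attaches a valid access token.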
Step 3 - Assign Permissions to the Sites (API Explorer)
Search for Sites Permissions
Use the siteId retrieved in the previous step to grant access to your application:
POST https://graph.microsoft.com/v1.0/sites/<siteId>/permissions
{
  "roles": [
    "fullcontrol"
  ],
  "grantedToIdentities": [
    {
      "application": {
        "id": "<App_Client_ID>",
        "displayName": "<App_Display_Name>"
      }
    }
  ]
}
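When several sites need to be whitelisted, it can help to generate the request for each site rather than editing the JSON by hand. This sketch assumes the same URL pattern and body shown above; the function name and the sample IDs are hypothetical placeholders, and the code only prepares the request without sending it.

```python
import json


def build_site_permission_request(site_id, app_client_id, app_display_name,
                                  role="fullcontrol"):
    """Return the POST URL and JSON body that grant the app a role on one site."""
    url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/permissions"
    body = {
        "roles": [role],
        "grantedToIdentities": [
            {
                "application": {
                    "id": app_client_id,
                    "displayName": app_display_name,
                }
            }
        ],
    }
    return url, json.dumps(body, indent=2)


# Placeholder values for illustration only.
for site_id in ["<siteId-1>", "<siteId-2>"]:
    url, body = build_site_permission_request(
        site_id, "<App_Client_ID>", "<App_Display_Name>"
    )
    print("POST", url)
    print(body)
```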
Data security
Data storage and residency: Indexed content and ACLs from SharePoint are stored within your Simpplr Enterprise Search environment, in the same region as your Simpplr tenant.
Encryption in transit: SSL (TLS 1.2 or higher), TLS encryption in Kafka. Auth: OAuth 2.0 client-credentials.
Encryption at rest: Server-side encryption with Amazon S3 managed keys (SSE-S3).
Permission enforcement: SharePoint access controls (users and groups) are stored in the ACL index and applied at query time. Search results are always filtered by the signed-in user’s identity and SharePoint group memberships.
Go to the Azure portal and sign in with your Azure account.
Search for and navigate to the “App registrations” service.
Click on the New registration button to register a new application.
Provide a name for your app, and optionally select the supported account types (e.g., single tenant, multi-tenant) appropriate for your Microsoft Entra ID setup.
Click on the Register button to create the app registration.
After the registration is complete, you will be redirected to the app’s overview page. Take note of the Application (client) ID value and Directory (Tenant) ID, as you’ll need them later.
Create a certificate and private key. For example, run the following command:
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout azure_app.key -out azure_app.crt
When prompted, enter the details of your tenant (optional; the fields can be left blank). This generates two files:
azure_app.crt (To be uploaded on Azure and Simpplr Admin Panel)
azure_app.key (To be uploaded on Simpplr Admin Panel)
Store both files in a safe and secure place.
Locate the certificates by navigating to Client credentials: Certificates & Secrets.
Select Upload certificate.
Upload the certificate created in the earlier step: azure_app.crt
Navigate to the Manage → API permissions section and click the Add a permission button.
In the Request API permissions pane, select Microsoft Graph as the API.
Choose Application permissions and select the following permissions under the Application tab. Depending on the access option you chose, use either Sites.FullControl.All or Sites.Selected:
Graph API
- Sites.Read.All
- Files.Read.All
- Group.Read.All
- User.Read.All
- Sites.FullControl.All or Sites.Selected
SharePoint
- Sites.Read.All
- Sites.FullControl.All or Sites.Selected
Similarly add the remaining permissions (Sites.Read.All, Group.Read.All)
Click on the Add permissions button to add the selected permissions to your app. Finally, click on the Grant admin consent button to grant the required permissions to the app. This step requires administrative privileges.
To fetch the tenant name, navigate to Microsoft Entra ID and find the primary domain on the home page. For example, if the primary domain is simplrdev.onmicrosoft.com, the tenant name is simplrdev.
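The tenant name is simply the part of the primary domain before .onmicrosoft.com. As a small illustration, the sketch below derives it; the function name is hypothetical, and it assumes the primary domain uses the default .onmicrosoft.com suffix (a custom primary domain would need to be looked up differently).

```python
def tenant_name_from_domain(primary_domain: str) -> str:
    """Return the tenant name for a '<tenant>.onmicrosoft.com' primary domain."""
    suffix = ".onmicrosoft.com"
    domain = primary_domain.strip().lower()
    if not domain.endswith(suffix) or domain == suffix:
        raise ValueError(f"Not an onmicrosoft.com primary domain: {primary_domain!r}")
    return domain[: -len(suffix)]


print(tenant_name_from_domain("simplrdev.onmicrosoft.com"))
```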
In Simpplr, go to: Manage Features → Enterprise Search → Add Source.
Search for and select “Microsoft SharePoint”.
Enter basic information:
Connection Name: (Connector Name for this instance)
Provide authentication details (copied from the client application) and select the certificate toggle as the authentication method:
Tenant ID
Tenant Name: Locate the primary domain in the Microsoft Entra ID portal (e.g., for simpplr.onmicrosoft.com, the tenant name is simpplr)
Client ID
Auth Method - Use “Certificate”
Certificate Key - Contents of azure_app.key (copy the contents exactly as they are)
On Windows CMD, you can use “type azure_app.key”
On Linux, macOS, or Windows PowerShell, you can use “cat azure_app.key”
Certificate Secret - Contents of azure_app.crt (copy the contents exactly as they are, as with the Certificate Key)
Click “Save” and “Confirm”.
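A common cause of authentication failures is a key or certificate that was truncated while copying. The sketch below is an optional local sanity check, not part of the Simpplr product: it verifies that each pasted value is one complete PEM block. It assumes the files were generated with the openssl command above, which typically produces a PRIVATE KEY (or RSA PRIVATE KEY on older OpenSSL versions) block and a CERTIFICATE block. All names are hypothetical, and the sample bodies are dummy placeholders.

```python
import re

# Matches one complete PEM block: BEGIN line, base64 body, matching END line.
PEM_BLOCK = re.compile(
    r"-----BEGIN (?P<label>[A-Z ]+)-----\s+"
    r"(?P<body>[A-Za-z0-9+/=\s]+?)"
    r"-----END (?P=label)-----"
)

KEY_LABELS = {"PRIVATE KEY", "RSA PRIVATE KEY"}


def pem_label(text):
    """Return the PEM label if text is exactly one complete PEM block, else None."""
    match = PEM_BLOCK.fullmatch(text.strip())
    return match.group("label") if match else None


def check_pasted_values(key_text, cert_text):
    """Return a list of problems found in the pasted key and certificate."""
    problems = []
    if pem_label(key_text) not in KEY_LABELS:
        problems.append("Certificate Key is not a complete private key PEM block")
    if pem_label(cert_text) != "CERTIFICATE":
        problems.append("Certificate Secret is not a complete certificate PEM block")
    return problems


# Dummy placeholder content, for illustration only.
sample_key = "-----BEGIN PRIVATE KEY-----\nAAAA\n-----END PRIVATE KEY-----\n"
sample_cert = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
print(check_pasted_values(sample_key, sample_cert))
```

This checks only the outer structure of the text; it does not validate the key material or confirm that the certificate matches the one uploaded to Azure.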
Configure Site based rules:
Include Specific Sites: Include the files belonging to specific Site (link or ID)
Exclude Specific Sites: Exclude the files belonging to specific Site (link or ID)
Configure Common Filters rules (Exclusion only):
File extension (e.g., .zip, .exe)
File size above a specified threshold
Document age (e.g., older than specified date)
Configure Audience based filtering.
Include audiences
Exclude audiences
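To make the common exclusion filters concrete, here is a minimal sketch of how the three rules (extension, size, age) could combine. It is an illustration under stated assumptions, not the connector's implementation: the class, field, and function names are hypothetical, and "document age" is assumed to be measured against the last modified time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class ExclusionRules:
    """Hypothetical container for the common exclusion filters."""
    extensions: frozenset = frozenset()          # e.g. {".zip", ".exe"}
    max_size_bytes: Optional[int] = None         # exclude files above this size
    modified_before: Optional[datetime] = None   # exclude files older than this


def is_excluded(name, size_bytes, last_modified, rules):
    """Return True if any configured exclusion rule matches the file."""
    extension = PurePosixPath(name).suffix.lower()
    if extension in rules.extensions:
        return True
    if rules.max_size_bytes is not None and size_bytes > rules.max_size_bytes:
        return True
    if rules.modified_before is not None and last_modified < rules.modified_before:
        return True
    return False


rules = ExclusionRules(
    extensions=frozenset({".zip", ".exe"}),
    max_size_bytes=50 * 1024 * 1024,
    modified_before=datetime(2020, 1, 1, tzinfo=timezone.utc),
)
recent = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(is_excluded("archive.ZIP", 1024, recent, rules))   # extension rule matches
print(is_excluded("report.pdf", 1024, recent, rules))    # no rule matches
```

A file is dropped as soon as any single rule matches, which reflects that these filters are exclusion-only.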
Default schedule: Full crawl at first setup and then once a week; incremental sync every 4 hours; ACL sync every hour.
Configuration options:
The sync schedule cannot be configured; however, a sync can be paused and resumed manually.
Monitor the initial full sync status (starts automatically) in the connector dashboard.
Crawling and sync behavior
How the connector works over time:
Initial full crawl
All content present in the SharePoint Storage account is indexed during the first run.
How long it may take: Depends on the size of the content.
Incremental updates
Mechanism: Based on Timestamp of previous sync.
What changes trigger reindexing:
New items created
Existing items updated
Permissions changed
Items moved or renamed
Items deleted or archived
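As a rough illustration of a timestamp-based mechanism, the sketch below selects items whose last modified time is later than the previous sync. It is a simplified model, not the connector's actual code: the function names are hypothetical, and the items are plain dictionaries using the lastModifiedDateTime field name from the field mapping in this article.

```python
from datetime import datetime, timezone


def parse_graph_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as '2024-05-01T10:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_since(items, last_sync):
    """Return the items modified after the previous sync timestamp."""
    return [
        item for item in items
        if parse_graph_timestamp(item["lastModifiedDateTime"]) > last_sync
    ]


items = [
    {"name": "plan.docx", "lastModifiedDateTime": "2024-05-01T10:00:00Z"},
    {"name": "notes.docx", "lastModifiedDateTime": "2024-05-01T13:30:00Z"},
]
last_sync = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)

# Only notes.docx changed after the previous sync.
print([item["name"] for item in changed_since(items, last_sync)])
```

A comparison like this covers created and updated items; deletions leave no item behind to compare, so they have to be detected by other means.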
Deletion and permission changes
Deleted items are removed from index at next sync.
Permission changes are updated at the next sync cycle.
Expected latency
With the default schedule (incremental sync every 4 hours and ACL sync every hour), changes made to SharePoint content are generally reflected in Simpplr search results within 4 hours of the update. Permission changes can also lag by up to 4 hours, because the incremental sync runs every 4 hours. In certain cases, a permission change can take up to 7 days to apply (that is, until the next weekly full sync runs), subject to content volume and system load.
Default field mapping
Source field SharePoint → Index field Simpplr
title | name |
url | webUrl |
owner/created by | createdBy.user.email |
file type |
|
Last modified | lastModifiedDateTime |
Created date | createdDateTime |
size | size |
permissions /access control | _allow_access_control |
Search experience - how content from this connector appears in search:
Result layout: (Icon, Connector name, Folder name, title (name) as link, body (excerpt), Created Date, File Type icon, file type)
Page (Icon, Connector name, Content title, body (excerpt), Created Date, Author, Updated Timestamp)
Site (Icon, Site Display Name as link)
Available filters and facets:
Sources = SharePoint
File type
Owner
Created Date
Participation in advanced features:
Smart Answers / Q&A: Yes
Autocomplete: Yes
Recommendations / “Suggested for you”: N/A
Trending / popular results: N/A
Semantic / hybrid ranking: Yes
Limits and known limitations
Maximum file size indexed | Content is not extracted from files larger than 10 MB. |
Unsupported file types | Compressed files are not supported, e.g., an archive file containing a set of PDFs. The file content is not searchable; however, users can still search by file title. |
Rate limits | N/A |
Preview limitations | No preview is available for Excel or media files. |
Permission edge case | Permission changes are not reflected until the next ACL sync runs. |
Other known limitations |
|
Connector health and monitoring - Admins can see status information at:
Enterprise Search → Connector name
Available metrics:
Last sync status (Success / Warning / Failed)
Last sync time
Next scheduled sync
Sync Type
Total items indexed count
Common issues and resolutions
Issue: Authentication failed, Failed to fetch the access token (invalid credentials or missing scopes)
Possible causes:
Incorrect client ID or secret
App not granted the required permissions
Admin consent not granted to the app
Resolution:
Verify and re-enter credentials
Confirm required permissions are granted
Confirm that admin consent has been granted to the app.
When to contact support:
The authentication error persists even after trying the resolutions above.
Sync is stuck in the Pending state.
Sync is in progress but no documents are being ingested.
Sync fails with a cancelled error (when it was not cancelled manually).
Sync is incomplete or partial.
When contacting Support, include:
Connector name and instance ID (if available)
Organization URL
Approximate time and date of the issue
Error messages or screenshots
Steps you already tried
Q1. Can I connect multiple SharePoint tenants or domains?
A. Multiple SharePoint connections can be configured in the Simpplr environment.
Q2. How often does SharePoint sync data?
A. The connector runs a full crawl on first setup and then once per week. Incremental sync runs every hour.
Q3. Are comments, revisions, or version history indexed?
A. Comments and individual versions are not indexed as separate items. The connector indexes the latest file metadata, including the last updated time and updated-by user.
Q4. Does the connector index content from external guests or shared links?
A. No. Content from external guests and shared links is not ingested.
Q5. What happens when a user loses access to an item in SharePoint Storage?
A. The updated access permissions will be indexed during the next sync.
Note: Files and permissions are synced every hour. However, the actual update time may vary depending on the volume of data created within that period. Under normal conditions, changes are reflected within 1–2 hours, provided there has not been a significant spike in data uploads.
Q6. Can I exclude certain sites/teams/folders from being indexed?
A. Documents can be included and excluded based on the Sites and IDs. Documents can also be excluded based on file extension, size, and age. Additionally, documents can be included or excluded based on audiences.