
Gmail connector for Simpplr enterprise search

Updated 15 days ago

Introduction

The Gmail connector allows Simpplr Enterprise Search to index messages from Google Workspace mailboxes in your domain, making email discoverable in enterprise search alongside other connected systems.

With this connector, you can:

  • Search mail across configured Workspace users without relying only on the Gmail client.

  • Unify email with other enterprise sources in one search experience.

  • Enforce document-level security (DLS) so users typically see only messages indexed from their own mailbox.

  • Run full and incremental syncs to keep the index updated.

Indexed content from this connector is available in:

  • Main enterprise search results

  • Smart Answers and other advanced search features (when enabled for your deployment)

Capabilities at a glance

| Step | Behavior |
| --- | --- |
| User discovery | Google Directory API lists all users for the Customer ID |
| Message listing | Gmail messages.list with a 730-day after: window (and optional advanced queries) |
| Message fetch | Full message content via Gmail API; RFC 822 parsed to subject, body, headers, labels |
| DLS | Owner email stamped on each document when enabled |
| Spam/Trash | Not included |
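
The user discovery step can be pictured as a pagination loop. The sketch below is illustrative only and is not the connector's actual code: `fetch_page` is a hypothetical callable standing in for one Directory API `users.list` request, which returns a `users` array and, when more results exist, a `nextPageToken`.

```python
from typing import Callable, Iterator, Optional


def iter_primary_emails(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[str]:
    """Yield primaryEmail for every user across all result pages.

    fetch_page is a hypothetical stand-in for one users.list request.
    It receives the page token (None for the first page) and returns
    the decoded JSON response.
    """
    page_token = None
    while True:
        response = fetch_page(page_token)
        for user in response.get("users", []):
            yield user["primaryEmail"]
        page_token = response.get("nextPageToken")
        if not page_token:
            break


# Usage with canned pages instead of a live API call.
_pages = {
    None: {"users": [{"primaryEmail": "a@example.com"}], "nextPageToken": "p2"},
    "p2": {"users": [{"primaryEmail": "b@example.com"}]},
}
emails = list(iter_primary_emails(lambda token: _pages[token]))
```

Each yielded address would then be the mailbox to impersonate for message listing.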

Objects and content supported

Object types

| Type | Description |
| --- | --- |
| Email (message) | One document per Gmail message ID in each mailbox crawled |
| User identity | Workspace user records for access-control sync; not searchable mail content |

What is not a separate document type

  • Threads: Not one document per thread; only per-message documents.

  • Labels / folders: Labels are attributes on each message, not standalone documents.

  • Calendar, Drive, Chat: Not included.

Metadata captured (per message)

| Field | Description |
| --- | --- |
| subject | Decoded email subject |
| email_body | Plain text from the message (HTML converted to text when needed) |
| email_metadata | Structured headers: from, to, cc, bcc, reply-to, message-id, dates |
| labels | Human-readable Gmail label names (e.g. INBOX, SENT, custom labels) |
| email_attachments | Filename, content type, size; attachment file content is not indexed |
| _timestamp | Gmail internalDate (when the message was received/stored) |
| object_type | email |
| allowaccess_control | Mailbox owner email when DLS is enabled |
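
To make the field list concrete, here is a minimal sketch of turning a raw RFC 822 message into a document with some of these fields, using only Python's standard `email` package. The function name and the exact dictionary shape are assumptions for illustration; the connector's real parser is not shown in this article, and the HTML-to-text fallback is omitted here.

```python
from email import message_from_bytes, policy


def parse_message(raw: bytes) -> dict:
    """Parse raw RFC 822 bytes into a document-like dict (illustrative)."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Prefer the text/plain part; HTML conversion is out of scope here.
    body_part = msg.get_body(preferencelist=("plain",))
    attachments = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        # Metadata only: the attachment bytes themselves are not kept.
        attachments.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "size": len(payload),
        })
    return {
        "subject": str(msg.get("Subject", "")),
        "email_body": body_part.get_content().strip() if body_part else "",
        "email_metadata": {
            "from": str(msg.get("From", "")),
            "to": str(msg.get("To", "")),
            "message-id": str(msg.get("Message-ID", "")),
        },
        "email_attachments": attachments,
        "object_type": "email",
    }


raw = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: =?utf-8?q?Quarterly_plan?=\r\n"
    b"Message-ID: <1@example.com>\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"See the plan.\r\n"
)
doc = parse_message(raw)
```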

Permissions model (DLS)

When document-level security is enabled (the default in connector configuration):

  • Each message indexed from a user’s mailbox is stamped with that user’s primary email in allowaccess_control.

  • Search uses identity documents synced from the Google Directory API so users are intended to see only their own mailbox messages.

  • Google Groups, delegates, and shared-mailbox edge cases are not modeled beyond mailbox ownership.

Sync window

Full sync and incremental list fallback only index messages within a 730-day lookback (approximately two years), using a Gmail after:YYYY/MM/DD filter.

Spam and Trash are excluded (includeSpamTrash is off).
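
As a worked example of the lookback, the sketch below derives the `after:YYYY/MM/DD` query string from a given date. The function name and the request-parameter dictionary are assumptions for illustration; `q` and `includeSpamTrash` are the Gmail `messages.list` parameters the description above refers to.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 730  # lookback stated in this article


def build_after_query(today: date, lookback_days: int = LOOKBACK_DAYS) -> str:
    """Return a Gmail search filter covering the lookback window."""
    cutoff = today - timedelta(days=lookback_days)
    return f"after:{cutoff:%Y/%m/%d}"


query = build_after_query(date(2024, 1, 1))
# Illustrative messages.list parameters for one mailbox.
list_params = {"q": query, "includeSpamTrash": False}
```

For a sync that runs on 1 January 2024, the cutoff is 1 January 2022, so a message received in December 2021 would not be listed.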

Versions and editions supported

| Category | Details |
| --- | --- |
| Supported | Google Workspace with Gmail and Admin SDK / Directory API |
| Auth | Google Cloud service account with domain-wide delegation |
| API access | gmail.readonly, admin.directory.user.readonly (via delegated admin) |
| Not supported | Consumer Gmail (@gmail.com) accounts outside Workspace; attachment binary indexing; per-group ACL mapping |

Prerequisites

Before you begin, ensure the following:

Google Workspace administration

  • A Google Workspace super admin or admin who can configure domain-wide delegation and OAuth client access.

  • Customer ID from Admin console → Account → Account settings.

  • A dedicated Google Cloud service account for the connector.

Domain-wide delegation

  1. Create a service account in Google Cloud and download the JSON key.

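  2. In Google Workspace Admin, authorize the client ID with these read-only OAuth scopes: https://www.googleapis.com/auth/gmail.readonly and https://www.googleapis.com/auth/admin.directory.user.readonly.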

  3. Set the delegation subject to a Workspace admin email used for Directory and Gmail API calls.

Application credentials

| Field | Required | Description |
| --- | --- | --- |
| GMail service account JSON | Yes | Full service account key JSON |
| Google Workspace admin email | Yes | Admin email for domain-wide delegation (subject) |
| Google customer id | Yes | Workspace Customer ID |

Authentication and security

Authentication mechanism

The connector uses a Google Cloud service account with domain-wide delegation. For each user mailbox, it impersonates that user’s primaryEmail when calling the Gmail API. Directory calls use the configured admin subject.

Connection validation checks both Gmail and Google Directory APIs.
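
To illustrate what impersonation means at the protocol level, the sketch below builds the JWT claim set that Google's OAuth 2.0 service account flow uses for domain-wide delegation, where `sub` names the user being impersonated. This is a simplified illustration, not the connector's code: signing the claims with the service account's private key and exchanging them for an access token are omitted, and in practice a Google auth library normally handles all of this. The function name and example addresses are hypothetical.

```python
import time
from typing import Dict, List, Optional, Union

TOKEN_URI = "https://oauth2.googleapis.com/token"
SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/admin.directory.user.readonly",
]


def build_delegation_claims(
    client_email: str,
    subject: str,
    scopes: List[str],
    now: Optional[int] = None,
    lifetime_seconds: int = 3600,
) -> Dict[str, Union[str, int]]:
    """Build the (unsigned) JWT claims for a delegated token request."""
    issued_at = int(time.time()) if now is None else now
    return {
        "iss": client_email,        # the service account address
        "sub": subject,             # the Workspace user to impersonate
        "scope": " ".join(scopes),  # space-delimited scope list
        "aud": TOKEN_URI,
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
    }


# Gmail calls use the mailbox owner as subject; Directory calls would
# use the configured admin email instead.
claims = build_delegation_claims(
    "connector@example-project.iam.gserviceaccount.com",
    "owner@company.com",
    SCOPES,
    now=1_700_000_000,
)
```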

Data security

| Topic | Behavior |
| --- | --- |
| Credential storage | Service account JSON stored as a sensitive connector field |
| Permission enforcement | When DLS is on, query-time filters use allowaccess_control and synced user identities |
| API access level | Read-only Gmail and Directory scopes |

Setup and configuration

Step 1 — Prepare Google Workspace and Cloud

  1. Create a Google Cloud Project - Log into Google Cloud Platform and go to the Console. 

  2. Click the Project Dropdown at the top left of the screen (next to the Google Cloud logo).

  3. Click New Project

  4. Give your project a name, optionally change the project ID, and click Create.

  5. Enable Google APIs - Choose APIs & Services from the left menu and click Enable APIs and Services. Enable both the Gmail API and the Google Admin SDK API, following the same steps for each.

  6. Create a service account - In the APIs & Services section, click Credentials, then Create credentials > Service account. Give the service account a name and a service account ID. The resulting service account address looks like an email address; copy it, because it identifies the service account later. Click Done.

  7. Download the JSON key and store it securely.
    1. In the left sidebar, navigate to IAM & Admin > Service Accounts.
    2. Click the email address of the service account that you want to create a key for.
    3. Click the Keys tab. Click the Add key drop-down menu, then select Create new key.
    4. Select JSON as the Key type and then click Create. This will download a JSON file that will contain the service account credentials.

  8. While still on your Service Account's page, click on the Details tab at the top. Scroll down to the Advanced settings section. Look for Domain-wide Delegation and copy the Client ID (a long string of numbers).

  9. Open a new tab and go to Google Admin Console.

  10. In the left menu, go to Security > Access and data control > API controls. Scroll to the bottom of the page and click Manage Domain Wide Delegation. Click Add new and paste the service account client ID into the Client ID field.
    In the OAuth scopes field, enter these two scope URLs exactly (the field is comma-delimited):

    • https://www.googleapis.com/auth/gmail.readonly

    • https://www.googleapis.com/auth/admin.directory.user.readonly

  11. Click Authorize.

  12. Note your Customer ID and choose an admin email for the delegation subject.

  13. Confirm the service account can list users and read mail for a test mailbox.

Step 2 — Create the connector in Simpplr Enterprise Search

  1. Go to Enterprise Search → Connectors → Add connector.

  2. Select GMail (Gmail).

  3. Enter:

    • Name and optional description

    • GMail service account JSON (paste full JSON)

    • Google Workspace admin email

    • Google customer id (in the Google Admin console, go to Account > Account settings in the left-hand navigation menu)

  4. Configure audience-based filtering:

    • Include audiences

    • Exclude audiences

  5. Click Save and Sync.

  6. Run a full sync first. Incremental sync requires a successful full sync.

Step 3 — Monitor

  1. Review connector Health and last sync status.

  2. Confirm document counts grow after the first full sync.

  3. Schedule incremental sync after full sync (see Crawling and sync behavior).

Access control sync (DLS)

When document-level security is enabled:

  1. User sync — users.list for the Customer ID yields identity documents keyed by primaryEmail.

  2. Message sync — Each message document includes allowaccess_control: ["owner@company.com"] for the mailbox being crawled.

  3. Search — Queries apply a filter so the signed-in user’s email matches allowed identities on documents.

Only the mailbox owner’s primary email is used as the access principal; Google Groups are not expanded into the access list.
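
The search-time check described above can be sketched as a simple filter. This is a conceptual illustration rather than the search engine's real query logic. Two choices here are assumptions, not documented behavior: addresses are compared case-insensitively, and a document with no allowaccess_control value is treated as not visible.

```python
from typing import Dict, List


def visible_documents(documents: List[Dict], user_email: str) -> List[Dict]:
    """Keep only documents whose access list contains the user's email."""
    wanted = user_email.strip().lower()
    visible = []
    for document in documents:
        allowed = {a.lower() for a in document.get("allowaccess_control", [])}
        if wanted in allowed:
            visible.append(document)
    return visible


docs = [
    {"id": "m1", "allowaccess_control": ["owner@company.com"]},
    {"id": "m2", "allowaccess_control": ["other@company.com"]},
]
visible_ids = [d["id"] for d in visible_documents(docs, "Owner@company.com")]
```

Because only the mailbox owner's address is stamped, a colleague who received the same email sees their own copy, never the sender's.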

Crawling and sync behavior

Full sync

Used for initial load and periodic reconciliation.

Incremental sync

Supported. Requires a prior successful full sync.

| Path | When used | Behavior |
| --- | --- | --- |
| History API (preferred) | Per-user historyId stored from last sync | Processes adds, label changes, and deletes since startHistoryId |
| List fallback | No historyId, or history 404 (expired, ~7 days) | Re-lists messages in the 730-day window and indexes updates |
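
The choice between the two paths can be sketched as follows. The names are hypothetical: `fetch_history` stands in for a per-user History API call, and `HistoryExpired` represents the 404 returned when the stored historyId is too old. The real connector's control flow may differ.

```python
from typing import Callable, List, Optional, Tuple


class HistoryExpired(Exception):
    """Hypothetical error for a History API 404 (historyId too old)."""


def plan_incremental_sync(
    stored_history_id: Optional[str],
    fetch_history: Callable[[str], List[dict]],
) -> Tuple[str, List[dict]]:
    """Return the path taken and any history records retrieved."""
    if stored_history_id is None:
        # No checkpoint for this user yet: re-list the lookback window.
        return "list_fallback", []
    try:
        return "history", fetch_history(stored_history_id)
    except HistoryExpired:
        # Checkpoint no longer valid: fall back to re-listing.
        return "list_fallback", []


def _expired(_history_id: str) -> List[dict]:
    raise HistoryExpired()


path_a, _ = plan_incremental_sync(None, _expired)
path_b, _ = plan_incremental_sync("100", _expired)
path_c, records = plan_incremental_sync("100", lambda history_id: [{"id": "101"}])
```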


What triggers reindexing or removal

| Change in Gmail | Effect |
| --- | --- |
| New message | Indexed on next sync (full or incremental) |
| Updated labels / metadata | Re-indexed (incremental history or full/list) |
| Deleted message | Removed from index on incremental when History API reports deletion |
| User removed from Directory | Existing indexed mail may remain until re-sync/cleanup policies apply |
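
As an illustration of how history records map to the effects above, the sketch below reduces a list of Gmail history records into two sets of message IDs: those to (re)index and those to remove. The record keys (`messagesAdded`, `messagesDeleted`, `labelsAdded`, `labelsRemoved`) are those of the Gmail History API; the function itself is an assumption about how such records could be processed, not the connector's implementation.

```python
from typing import Iterable, Set, Tuple

REINDEX_KEYS = ("messagesAdded", "labelsAdded", "labelsRemoved")


def summarize_history(history_records: Iterable[dict]) -> Tuple[Set[str], Set[str]]:
    """Return (ids_to_index, ids_to_delete) from chronological records."""
    to_index: Set[str] = set()
    to_delete: Set[str] = set()
    for record in history_records:
        for key in REINDEX_KEYS:
            for change in record.get(key, []):
                message_id = change["message"]["id"]
                to_index.add(message_id)
                to_delete.discard(message_id)
        for change in record.get("messagesDeleted", []):
            message_id = change["message"]["id"]
            # A later delete overrides any earlier add or label change.
            to_delete.add(message_id)
            to_index.discard(message_id)
    return to_index, to_delete


sample = [
    {"messagesAdded": [{"message": {"id": "m1"}}, {"message": {"id": "m2"}}]},
    {"labelsAdded": [{"message": {"id": "m3"}, "labelIds": ["STARRED"]}]},
    {"messagesDeleted": [{"message": {"id": "m2"}}]},
]
ids_to_index, ids_to_delete = summarize_history(sample)
```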

Recommended schedule

| Job type | Typical frequency |
| --- | --- |
| Full sync | Weekly (or after major config changes) |
| Incremental sync | Hourly or daily for active domains |
| Access control sync | Hourly when DLS is enabled |

Expected latency

Changes appear in search only after the next successful sync for that job type, not in real time. For large domains, history-based incremental runs are typically much faster than re-listing every mailbox.

Field mapping and search experience

Default field mapping

| Source (Gmail / Connector) | Search Use |
| --- | --- |
| Gmail message ID | Document id (unique per mailbox copy) |
| Subject | subject (title) |
| Plain body | email_body (main full-text) |
| Headers object | email_metadata (from, to, cc, dates, etc.) |
| Label names | labels |
| Attachment list | email_attachments (metadata only) |
| internalDate | _timestamp |
| (constant) | object_type = email |
| (constant) | connector_type = gmail |
| Owner (DLS) | allowaccess_control |

Search experience

  • Result layout: Subject, author, date, snippet from email_body.

  • Useful filters: Source = Gmail, date, participants (from/to addresses when mapped in search config).

  • Autocomplete: Often wired to subject and sender/recipient addresses.

  • Smart Answers / semantic search: Supported when email_body is indexed and embeddings are enabled.

Important search note

The same logical email in sender vs receiver mailboxes has different Gmail message IDs and therefore separate indexed documents. Users only see their mailbox copy when DLS is enabled.

Known limitations

| Limitation | Details |
| --- | --- |
| Mailbox-scoped DLS | No Google Group or delegate sharing model in access lists |
| Attachment content | Filenames and types only; no PDF/Office text from attachments in the connector |
| 730-day window | Older mail is not listed unless sync_from_timestamp extends policy (still bounded by rule design) |
| Shared mailboxes | Treated as the impersonated primary mailbox only |
| Cross-mailbox deduplication | Same thread may appear as multiple documents across users |

Monitoring and troubleshooting

Connector health

| Metric | Description |
| --- | --- |
| Last sync status | Success / Warning / Failed |
| Last sync time | When the job finished |
| Documents indexed | Approximate mail documents in the index |

Common issues and resolutions

Issue: Authentication or test connection failed

Possible causes:

  • Invalid or expired service account JSON

  • Domain-wide delegation not configured or wrong scopes

  • Wrong admin subject or Customer ID

  • Gmail or Directory API not enabled in Google Cloud

Resolution:

  1. Re-verify JSON, client ID, and scopes in Workspace Admin.

  2. Confirm the admin email is a valid Workspace administrator.

  3. Re-run the sync.
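
Before re-pasting credentials, a quick local check of the key file can rule out a truncated or wrong JSON. The sketch below only verifies that the fields normally present in a Google service account key exist; it cannot detect a revoked key or missing delegation. The function name and the chosen field list are assumptions for illustration.

```python
import json
from typing import List

REQUIRED_KEYS = ("type", "client_email", "client_id", "private_key", "token_uri")


def check_service_account_json(text: str) -> List[str]:
    """Return a list of problems found; an empty list means none found."""
    try:
        info = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(info, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if not info.get(key)]
    if info.get("type") not in (None, "", "service_account"):
        problems.append("type is not 'service_account'")
    return problems


good = json.dumps({
    "type": "service_account",
    "client_email": "connector@example-project.iam.gserviceaccount.com",
    "client_id": "123456789",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "token_uri": "https://oauth2.googleapis.com/token",
})
good_result = check_service_account_json(good)
empty_result = check_service_account_json("{}")
```

The `client_id` value in the file is the same number that must be authorized under domain-wide delegation in the Admin console.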

Issue: Incremental sync fails immediately

Message: Full sync required first / last sync time not found.

Resolution:

  1. Run a successful full sync.

Issue: User has no new mail in search but mail exists in Gmail

Possible causes:

  • Message older than 730-day window

  • Message in Spam or Trash (excluded)

  • User sync skipped due to an API error

  • DLS: the user is signed in with an identity that differs from the mailbox owner's primary email

Resolution:

  1. Check message date and label location in Gmail.

  2. Verify the user’s primary email.
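
The first two causes can be checked mechanically from a message's Gmail metadata. The sketch below is a hypothetical helper, not part of the product: it takes the message's `internalDate` (epoch milliseconds, as Gmail returns it) and its label IDs, and reports why the message would fall outside the index. The boundary is approximate, because the connector's `after:` filter works at day granularity.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


def exclusion_reasons(
    internal_date_ms: str,
    label_ids: Iterable[str],
    now: datetime,
    lookback_days: int = 730,
) -> List[str]:
    """List reasons a message would not be indexed (empty if none)."""
    reasons = []
    received = datetime.fromtimestamp(int(internal_date_ms) / 1000, tz=timezone.utc)
    if received < now - timedelta(days=lookback_days):
        reasons.append("older than the lookback window")
    labels = set(label_ids)
    for label in ("SPAM", "TRASH"):
        if label in labels:
            reasons.append(f"labeled {label}")
    return reasons


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
old_message = exclusion_reasons("1577836800000", ["INBOX"], now)           # 2020-01-01
trashed_message = exclusion_reasons("1701388800000", ["INBOX", "TRASH"], now)  # 2023-12-01
```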

When to contact support

Contact Simpplr Support if:

  • Authentication fails after delegation and scopes are verified

  • Incremental sync never runs despite successful full sync

  • Large portions of a domain are missing with no errors in the UI

  • Users can see other users’ mail

Include: connector name, Customer ID (not secrets), approximate time, error screenshots.

Frequently asked questions

Q. Which mailboxes are indexed?
Ans: All users returned by Directory API for the configured Customer ID.

Q. How often should we sync?
Ans: Typical pattern: weekly full and daily incremental.

Q. Are attachments searchable?
Ans: Attachment metadata is stored; file contents are not extracted into the index by this connector.

Q. Can users see each other’s email?
Ans: No, when DLS is enabled. Each user should see only messages indexed from their own mailbox.

Q. Why is mail older than two years missing?
Ans: The connector applies a default 730-day Gmail after: filter.

Q. Is Spam or Trash indexed?
Ans: No.
