AI Segment Builder

This document covers the AI-powered segment builder that converts natural language queries into segment criteria.

Overview

The AI Segment Builder allows users to create segment rules using natural language instead of manually constructing JSON criteria. Users describe their target audience in plain English, and the AI translates it into valid segment criteria.

User: "Find platinum members who haven't booked in 60 days"

AI Output:
{
  "$and": [
    { "membershipTier": "PLATINUM" },
    { "lastBookingAt": { "$lt": "2025-10-17T00:00:00Z" } }
  ]
}

Architecture

User Query                    AI Service                     Output
─────────────────────────────────────────────────────────────────────
"High-value members     →   GPT-4o-mini with    →   { "$and": [
 at risk of churning"       prompt + validation      { "lifetimeValue": { "$gte": 100000 } },
                                                     { "churnRiskScore": { "$gte": 70 } }
                                                   ]}

Components

Component	Description
`SegmentBuilderService`	Main orchestrator service
`SegmentBuilderPromptBuilder`	Constructs system/user prompts
`SegmentBuilderValidator`	Validates AI output against field registry

API Endpoints

Build Segment from Natural Language

POST /v1/ai/segments/build

Converts a natural language query into segment criteria.

Request Body

{
  "query": "Find platinum and diamond members who are active",
  "tenantId": "tenant_123"
}

Response: 200 OK

{
  "success": true,
  "result": {
    "criteria": {
      "$and": [
        { "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] } },
        { "status": "ACTIVE" }
      ]
    },
    "explanation": "Platinum and diamond tier members with active status",
    "fieldsMapped": ["membershipTier", "status"],
    "confidence": 0.95,
    "ambiguities": []
  },
  "previewCount": 342,
  "usage": {
    "promptTokens": 1240,
    "completionTokens": 89
  }
}
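As an illustration, a client call to this endpoint could be sketched as follows. The base URL is a placeholder, and no authentication header is set because this document does not describe one; both are assumptions to adapt to your deployment.

```python
import json
import urllib.request


def make_build_request(base_url: str, query: str, tenant_id: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /v1/ai/segments/build."""
    body = json.dumps({"query": query, "tenantId": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/ai/segments/build",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_segment(base_url: str, query: str, tenant_id: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = make_build_request(base_url, query, tenant_id)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or test without network access.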

Refine Existing Criteria

POST /v1/ai/segments/refine

Refines existing segment criteria with additional natural language instructions.

Request Body

{
  "instruction": "Also require email opt-in",
  "currentCriteria": {
    "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] }
  },
  "tenantId": "tenant_123"
}

Response: 200 OK

{
  "success": true,
  "result": {
    "criteria": {
      "$and": [
        { "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] } },
        { "emailOptIn": true }
      ]
    },
    "explanation": "Platinum and diamond members who opted in to email",
    "fieldsMapped": ["membershipTier", "emailOptIn"],
    "confidence": 0.92,
    "ambiguities": []
  },
  "previewCount": 298
}

Preview Count

POST /v1/ai/segments/preview

Returns the count of customers matching the criteria.

Request Body

{
  "criteria": {
    "engagementScore": { "$gte": 70 }
  },
  "tenantId": "tenant_123"
}

Response: 200 OK

{
  "count": 456
}

Get Available Fields

GET /v1/ai/segments/fields

Returns all available fields for segment criteria.

Response: 200 OK

{
  "fields": [
    {
      "name": "engagementScore",
      "type": "number",
      "operators": ["eq", "neq", "lt", "lte", "gt", "gte"],
      "description": "Engagement score (0-100)"
    },
    {
      "name": "membershipTier",
      "type": "string",
      "operators": ["eq", "neq", "in", "nin"],
      "description": "Membership tier (e.g., PLATINUM, GOLD, SILVER)"
    }
  ]
}
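A client might index this response by field name to check which operators a field accepts before sending criteria. The sketch below assumes that a registry operator name such as `gte` corresponds to the criteria operator `$gte`; the helper name is hypothetical.

```python
def allowed_operators(fields_response: dict) -> dict[str, set[str]]:
    """Map each field name to the set of "$"-prefixed operators it accepts."""
    return {
        field["name"]: {"$" + op for op in field["operators"]}
        for field in fields_response["fields"]
    }


# Trimmed copy of the sample response above.
sample = {
    "fields": [
        {"name": "engagementScore", "type": "number",
         "operators": ["eq", "neq", "lt", "lte", "gt", "gte"]},
        {"name": "membershipTier", "type": "string",
         "operators": ["eq", "neq", "in", "nin"]},
    ]
}
lookup = allowed_operators(sample)
```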

Criteria Format

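The AI generates criteria using MongoDB-style operators. Some of them, such as `$neq`, `$contains`, `$startsWith`, and `$endsWith`, are specific to this format and are not MongoDB operators (MongoDB's not-equals operator is `$ne`):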

Operators

Operator	Description	Example
`$eq`	Equals	`{ "status": { "$eq": "ACTIVE" } }`
`$neq`	Not equals	`{ "status": { "$neq": "DELETED" } }`
`$gt`	Greater than	`{ "engagementScore": { "$gt": 70 } }`
`$gte`	Greater than or equal	`{ "engagementScore": { "$gte": 70 } }`
`$lt`	Less than	`{ "handicap": { "$lt": 10 } }`
`$lte`	Less than or equal	`{ "churnRiskScore": { "$lte": 30 } }`
`$in`	In array	`{ "membershipTier": { "$in": ["GOLD", "PLATINUM"] } }`
`$nin`	Not in array	`{ "status": { "$nin": ["MERGED", "DELETED"] } }`
`$contains`	String contains	`{ "email": { "$contains": "@company.com" } }`
`$startsWith`	String starts with	`{ "firstName": { "$startsWith": "Jo" } }`
`$endsWith`	String ends with	`{ "email": { "$endsWith": ".co.za" } }`

Combinators

Combinator	Description	Example
`$and`	All conditions must match	`{ "$and": [...conditions] }`
`$or`	At least one condition must match	`{ "$or": [...conditions] }`
`$not`	Negate condition	`{ "$not": { "status": "DELETED" } }`
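To make the semantics of the operators and combinators concrete, here is a minimal in-memory matcher. It is a sketch, not the service's query engine: how the real system treats missing fields is not documented here, so this version assumes a missing field never satisfies an ordering or string operator.

```python
import operator


def _ordered(compare):
    # Assumption: missing (None) values never satisfy an ordering comparison.
    return lambda actual, expected: actual is not None and compare(actual, expected)


def _text(method):
    # Assumption: string operators only match when the stored value is a string.
    return lambda actual, expected: isinstance(actual, str) and getattr(actual, method)(expected)


OPERATORS = {
    "$eq": operator.eq,
    "$neq": operator.ne,
    "$gt": _ordered(operator.gt),
    "$gte": _ordered(operator.ge),
    "$lt": _ordered(operator.lt),
    "$lte": _ordered(operator.le),
    "$in": lambda actual, expected: actual in expected,
    "$nin": lambda actual, expected: actual not in expected,
    "$contains": _text("__contains__"),
    "$startsWith": _text("startswith"),
    "$endsWith": _text("endswith"),
}


def matches(record: dict, criteria: dict) -> bool:
    """Return True when `record` satisfies every entry in `criteria`."""
    for key, condition in criteria.items():
        if key == "$and":
            ok = all(matches(record, child) for child in condition)
        elif key == "$or":
            ok = any(matches(record, child) for child in condition)
        elif key == "$not":
            ok = not matches(record, condition)
        elif isinstance(condition, dict):
            # Operator object such as {"$gte": 70}; every operator must hold.
            ok = all(OPERATORS[op](record.get(key), operand)
                     for op, operand in condition.items())
        else:
            # A bare value is shorthand for $eq, as in {"status": "ACTIVE"}.
            ok = record.get(key) == condition
        if not ok:
            return False
    return True
```

An unknown operator raises `KeyError` in this sketch; a production evaluator would reject it during validation instead.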

Date Placeholders

The AI can use date placeholders that are resolved at query time:

Placeholder	Description	Example
`{{N_DAYS_AGO}}`	N days before current date	`{{30_DAYS_AGO}}`
`{{N_WEEKS_AGO}}`	N weeks before current date	`{{2_WEEKS_AGO}}`
`{{N_MONTHS_AGO}}`	N months before current date	`{{3_MONTHS_AGO}}`
`{{START_OF_YEAR}}`	January 1st of current year	Resolves to `2025-01-01T00:00:00Z` during 2025
`{{START_OF_MONTH}}`	1st of current month	Resolves to `2025-12-01T00:00:00Z` during December 2025
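Placeholder resolution could look roughly like the sketch below. Two choices here are assumptions rather than documented behavior: results are truncated to midnight UTC (matching the `T00:00:00Z` examples in this document), and month arithmetic clamps the day when the target month is shorter.

```python
import calendar
import re
from datetime import datetime, timedelta, timezone

RELATIVE = re.compile(r"^\{\{(\d+)_(DAYS|WEEKS|MONTHS)_AGO\}\}$")


def months_ago(moment: datetime, n: int) -> datetime:
    """Step back n calendar months, clamping the day (Mar 31 -> Feb 28/29)."""
    index = moment.year * 12 + (moment.month - 1) - n
    year, month_index = divmod(index, 12)
    month = month_index + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def resolve_placeholder(value: str, now: datetime) -> str:
    """Turn a placeholder into an ISO 8601 UTC string; return other values unchanged.

    `now` must be a timezone-aware datetime.
    """
    midnight = now.astimezone(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    if value == "{{START_OF_YEAR}}":
        resolved = midnight.replace(month=1, day=1)
    elif value == "{{START_OF_MONTH}}":
        resolved = midnight.replace(day=1)
    else:
        match = RELATIVE.match(value)
        if match is None:
            return value  # Not a placeholder; leave as-is.
        amount, unit = int(match.group(1)), match.group(2)
        if unit == "DAYS":
            resolved = midnight - timedelta(days=amount)
        elif unit == "WEEKS":
            resolved = midnight - timedelta(weeks=amount)
        else:
            resolved = months_ago(midnight, amount)
    return resolved.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps resolution deterministic and testable.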

Available Fields

Identity & Contact

Field	Type	Operators	Description
`email`	string	eq, neq, contains, startsWith, endsWith	Customer email
`phoneNumber`	string	eq, neq, startsWith	E.164 format phone
`firstName`	string	eq, neq, contains, startsWith	First name
`lastName`	string	eq, neq, contains, startsWith	Last name
`country`	string	eq, neq, in, nin	ISO 3166-1 alpha-2 code
`city`	string	eq, neq, in, contains	City name

Status & Lifecycle

Field	Type	Values	Description
`status`	enum	ACTIVE, INACTIVE, MERGED, DELETED	Customer status
`lifecycleStage`	enum	SUBSCRIBER, LEAD, OPPORTUNITY, CUSTOMER, ADVOCATE	Lifecycle stage
`tags`	array	-	Customer tags

Membership

Field	Type	Description
`membershipTier`	string	Membership tier (PLATINUM, GOLD, etc.)
`membershipStatus`	string	Membership status
`memberSince`	datetime	Membership start date
`membershipExpiry`	datetime	Membership expiry date

Marketing Consent

Field	Type	Description
`emailOptIn`	boolean	Email marketing consent
`smsOptIn`	boolean	SMS marketing consent
`pushOptIn`	boolean	Push notification consent
`whatsappOptIn`	boolean	WhatsApp consent

Engagement Scores

Field	Type	Description
`engagementScore`	number	Engagement score (0-100)
`churnRiskScore`	number	Churn risk score (0-100)
`lifetimeValue`	number	Lifetime value in cents
`predictedLtv`	number	Predicted lifetime value

Activity Metrics

Field	Type	Description
`lastActivityAt`	datetime	Last activity timestamp
`activityCount30d`	number	Activities in last 30 days
`activityCount90d`	number	Activities in last 90 days
`bookingCount30d`	number	Bookings in last 30 days
`bookingCount90d`	number	Bookings in last 90 days
`lastBookingAt`	datetime	Last booking timestamp
`lastEmailOpenAt`	datetime	Last email open
`lastEmailClickAt`	datetime	Last email click

Golf Data

Field	Type	Description
`handicap`	number	Golf handicap (0-54)
`handicapProvider`	string	Provider (SAGA, USGA)
`homeClubId`	string	Home club ID
`homeClubName`	string	Home club name

Example Queries

Natural Language Query	Generated Criteria
"Platinum and diamond members"	`{ "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] } }`
"Golfers with handicap under 10"	`{ "handicap": { "$lt": 10 } }`
"Active customers who haven't booked in 60 days"	`{ "$and": [{ "status": "ACTIVE" }, { "lastBookingAt": { "$lt": "{{60_DAYS_AGO}}" } }] }`
"High engagement, low churn risk"	`{ "$and": [{ "engagementScore": { "$gte": 70 } }, { "churnRiskScore": { "$lte": 30 } }] }`
"Email subscribers in South Africa"	`{ "$and": [{ "emailOptIn": true }, { "country": "ZA" }] }`
"New members this year"	`{ "memberSince": { "$gte": "{{START_OF_YEAR}}" } }`
"Inactive members with high LTV"	`{ "$and": [{ "status": "INACTIVE" }, { "lifetimeValue": { "$gte": 500000 } }] }`

Error Handling

Error Codes

Code	Description	Resolution
`AMBIGUOUS_QUERY`	Query is too vague (confidence < 0.5)	Rephrase the query, using the returned suggestions as a guide
`INVALID_FIELD`	AI returned unknown field	Check the available fields (`GET /v1/ai/segments/fields`)
`PARSE_ERROR`	AI returned invalid JSON	Retry or rephrase the query
`AI_ERROR`	OpenAI API failure	Verify the API key, then retry

Ambiguous Query Response

When confidence falls below 0.5, the request fails with `AMBIGUOUS_QUERY` and the response includes suggestions for clarifying the query:

{
  "success": false,
  "error": {
    "code": "AMBIGUOUS_QUERY",
    "message": "Query is too ambiguous to build segment",
    "suggestions": [
      "What defines 'high engagement'? Specify a score threshold.",
      "Do you want email opt-in customers only?"
    ]
  }
}
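On the client side, the success and error shapes shown in this document could be handled along these lines. The exception class, the function name, and the `"UNKNOWN"` fallback code are hypothetical, introduced only for illustration.

```python
class SegmentBuildError(Exception):
    """Carries the error code and any suggestions returned by the API."""

    def __init__(self, code: str, message: str, suggestions: list[str]):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.suggestions = suggestions


def criteria_from_response(payload: dict) -> dict:
    """Return the criteria on success; raise SegmentBuildError otherwise."""
    if payload.get("success"):
        return payload["result"]["criteria"]
    error = payload.get("error", {})
    raise SegmentBuildError(
        code=error.get("code", "UNKNOWN"),  # hypothetical fallback, not an API code
        message=error.get("message", ""),
        suggestions=error.get("suggestions", []),
    )
```

A caller can catch `SegmentBuildError`, check `code == "AMBIGUOUS_QUERY"`, and show `suggestions` to the user as prompts for rephrasing.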

Configuration

Environment Variables

Variable	Description	Default
`AI_OPENAI_KEY`	OpenAI API key	Required
`AI_MODEL`	Model to use	`gpt-4o-mini`
`AI_MAX_TOKENS`	Max response tokens	`2048`
`AI_TEMP`	Temperature (0-1)	`0.2`
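For illustration, these variables could be loaded with the defaults from the table as in the sketch below. The `AiConfig` type and `load_config` function are hypothetical and say nothing about how the service itself reads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AiConfig:
    api_key: str
    model: str
    max_tokens: int
    temperature: float


def load_config(env: Mapping[str, str] = os.environ) -> AiConfig:
    """Read the AI_* variables, applying the documented defaults."""
    api_key = env.get("AI_OPENAI_KEY")
    if not api_key:
        raise RuntimeError("AI_OPENAI_KEY is required")
    temperature = float(env.get("AI_TEMP", "0.2"))
    # The table documents a 0-1 range for AI_TEMP.
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("AI_TEMP must be between 0 and 1")
    return AiConfig(
        api_key=api_key,
        model=env.get("AI_MODEL", "gpt-4o-mini"),
        max_tokens=int(env.get("AI_MAX_TOKENS", "2048")),
        temperature=temperature,
    )
```

Failing fast on a missing key or an out-of-range temperature surfaces misconfiguration at startup rather than on the first request.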

Validation

The validator ensures AI output is safe and valid:

Field validation: All fields must exist in registry
Operator validation: Operators must be valid for field type
Enum validation: Enum values must be in allowed list
Type validation: Values must match field types
Date normalization: Placeholders resolved to ISO strings
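The first four checks could be sketched as follows. This is not the `SegmentBuilderValidator` source: the three-entry registry is a stand-in, and the operator sets for `status` and `emailOptIn` are assumptions, since this document lists operators only for some fields. Date normalization is left out.

```python
NUMERIC_OPS = {"$eq", "$neq", "$lt", "$lte", "$gt", "$gte"}

# Stand-in registry; operator sets for status and emailOptIn are assumed.
REGISTRY = {
    "status": {"type": "enum", "operators": {"$eq", "$neq", "$in", "$nin"},
               "values": {"ACTIVE", "INACTIVE", "MERGED", "DELETED"}},
    "engagementScore": {"type": "number", "operators": NUMERIC_OPS},
    "emailOptIn": {"type": "boolean", "operators": {"$eq", "$neq"}},
}
COMBINATORS = {"$and", "$or", "$not"}


def type_ok(value, field_type: str) -> bool:
    if field_type == "boolean":
        return isinstance(value, bool)
    if field_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, str)  # string and enum values


def validate(criteria: dict) -> list[str]:
    """Return a list of problems; an empty list means the criteria passed."""
    errors: list[str] = []
    for key, condition in criteria.items():
        if key in COMBINATORS:
            children = [condition] if key == "$not" else condition
            for child in children:
                errors.extend(validate(child))
            continue
        spec = REGISTRY.get(key)
        if spec is None:  # field validation
            errors.append(f"Unknown field: {key}")
            continue
        # A bare value is shorthand for $eq.
        checks = condition if isinstance(condition, dict) else {"$eq": condition}
        for op, operand in checks.items():
            if op not in spec["operators"]:  # operator validation
                errors.append(f"Operator {op} not allowed for {key}")
                continue
            values = operand if op in ("$in", "$nin") else [operand]
            for value in values:
                if not type_ok(value, spec["type"]):  # type validation
                    errors.append(f"Value {value!r} has wrong type for {key}")
                elif "values" in spec and value not in spec["values"]:  # enum validation
                    errors.append(f"Value {value!r} not allowed for {key}")
    return errors
```

Collecting all errors instead of stopping at the first one lets a caller report every problem in a single response.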

Field Suggestions

When the AI returns a misspelled field name, the `INVALID_FIELD` error includes the closest registered field names as suggestions:

{
  "success": false,
  "error": {
    "code": "INVALID_FIELD",
    "message": "Unknown field: engagmentScore",
    "suggestions": ["engagementScore"]
  }
}
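One way to produce such suggestions is fuzzy string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library as a stand-in; the matching method the validator actually uses is not specified in this document, and the `0.8` cutoff is an arbitrary choice.

```python
import difflib


def suggest_fields(unknown: str, known: list[str], limit: int = 3) -> list[str]:
    """Return up to `limit` registered field names that closely resemble `unknown`."""
    return difflib.get_close_matches(unknown, known, n=limit, cutoff=0.8)


known_fields = ["engagementScore", "churnRiskScore", "membershipTier", "status", "emailOptIn"]
suggestions = suggest_fields("engagmentScore", known_fields)
```

A higher cutoff yields fewer, closer suggestions; a lower one tolerates larger typos at the cost of more false matches.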
