AI Segment Builder

This document covers the AI-powered segment builder that converts natural language queries into segment criteria.


Overview

The AI Segment Builder allows users to create segment rules using natural language instead of manually constructing JSON criteria. Users describe their target audience in plain English, and the AI translates it into valid segment criteria.

User: "Find platinum members who haven't booked in 60 days"

AI Output:
{
  "$and": [
    { "membershipTier": "PLATINUM" },
    { "lastBookingAt": { "$lt": "2025-10-17T00:00:00Z" } }
  ]
}

Architecture

User Query                  AI Service               Output
──────────────────────────────────────────────────────────────────────────────────────────────
"High-value members    →    GPT-4o-mini with    →    { "$and": [
 at risk of churning"       prompt + validation        { "lifetimeValue": { "$gte": 100000 } },
                                                       { "churnRiskScore": { "$gte": 70 } }
                                                     ]}

Components

| Component | Description |
| --- | --- |
| SegmentBuilderService | Main orchestrator service |
| SegmentBuilderPromptBuilder | Constructs system/user prompts |
| SegmentBuilderValidator | Validates AI output against field registry |

API Endpoints

Build Segment from Natural Language

POST /v1/ai/segments/build

Converts a natural language query into segment criteria.

Request Body

{
  "query": "Find platinum and diamond members who are active",
  "tenantId": "tenant_123"
}

Response: 200 OK

{
  "success": true,
  "result": {
    "criteria": {
      "$and": [
        { "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] } },
        { "status": "ACTIVE" }
      ]
    },
    "explanation": "Platinum and diamond tier members with active status",
    "fieldsMapped": ["membershipTier", "status"],
    "confidence": 0.95,
    "ambiguities": []
  },
  "previewCount": 342,
  "usage": {
    "promptTokens": 1240,
    "completionTokens": 89
  }
}
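
As an illustration of calling this endpoint, the sketch below prepares the request and unpacks the response using only the Python standard library. The base URL `https://api.example.com` and the helper names are hypothetical, and no authentication header is shown because the document does not describe one.

```python
import json
import urllib.request

# Hypothetical host; the document does not specify a base URL or auth scheme.
BASE_URL = "https://api.example.com"


def build_segment_request(query: str, tenant_id: str) -> urllib.request.Request:
    """Prepare the POST /v1/ai/segments/build request without sending it."""
    body = json.dumps({"query": query, "tenantId": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/ai/segments/build",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_criteria(response: dict) -> dict:
    """Return the criteria from a successful response, or raise on failure."""
    if not response.get("success"):
        error = response.get("error", {})
        raise ValueError(f"{error.get('code')}: {error.get('message')}")
    return response["result"]["criteria"]
```

To send the request, pass it to `urllib.request.urlopen` and decode the body with `json.load` before handing it to `extract_criteria`.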

Refine Existing Criteria

POST /v1/ai/segments/refine

Refines existing segment criteria with additional natural language instructions.

Request Body

{
  "instruction": "Also require email opt-in",
  "currentCriteria": {
    "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] }
  },
  "tenantId": "tenant_123"
}

Response: 200 OK

{
  "success": true,
  "result": {
    "criteria": {
      "$and": [
        { "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] } },
        { "emailOptIn": true }
      ]
    },
    "explanation": "Platinum and diamond members who opted in to email",
    "fieldsMapped": ["membershipTier", "emailOptIn"],
    "confidence": 0.92,
    "ambiguities": []
  },
  "previewCount": 298
}

Preview Count

POST /v1/ai/segments/preview

Returns the count of customers matching the criteria.

Request Body

{
  "criteria": {
    "engagementScore": { "$gte": 70 }
  },
  "tenantId": "tenant_123"
}

Response: 200 OK

{
  "count": 456
}

Get Available Fields

GET /v1/ai/segments/fields

Returns all available fields for segment criteria.

Response: 200 OK

{
  "fields": [
    {
      "name": "engagementScore",
      "type": "number",
      "operators": ["eq", "neq", "lt", "lte", "gt", "gte"],
      "description": "Engagement score (0-100)"
    },
    {
      "name": "membershipTier",
      "type": "string",
      "operators": ["eq", "neq", "in", "nin"],
      "description": "Membership tier (e.g., PLATINUM, GOLD, SILVER)"
    }
  ]
}

Criteria Format

The AI generates criteria using MongoDB-style operators:

Operators

| Operator | Description | Example |
| --- | --- | --- |
| $eq | Equals | { "status": { "$eq": "ACTIVE" } } |
| $neq | Not equals | { "status": { "$neq": "DELETED" } } |
| $gt | Greater than | { "engagementScore": { "$gt": 70 } } |
| $gte | Greater than or equal | { "engagementScore": { "$gte": 70 } } |
| $lt | Less than | { "handicap": { "$lt": 10 } } |
| $lte | Less than or equal | { "churnRiskScore": { "$lte": 30 } } |
| $in | In array | { "membershipTier": { "$in": ["GOLD", "PLATINUM"] } } |
| $nin | Not in array | { "status": { "$nin": ["MERGED", "DELETED"] } } |
| $contains | String contains | { "email": { "$contains": "@company.com" } } |
| $startsWith | String starts with | { "firstName": { "$startsWith": "Jo" } } |
| $endsWith | String ends with | { "email": { "$endsWith": ".co.za" } } |

Combinators

| Combinator | Description | Example |
| --- | --- | --- |
| $and | All conditions must match | { "$and": [...conditions] } |
| $or | Any condition must match | { "$or": [...conditions] } |
| $not | Negate condition | { "$not": { "status": "DELETED" } } |
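
To make the semantics of the operators and combinators concrete, here is a minimal in-memory evaluator. It is only a sketch: the document does not say how the service evaluates criteria, and the function name `matches` is hypothetical. Two assumptions are baked in: a bare value such as `{ "status": "ACTIVE" }` is treated as shorthand for `$eq` (as the examples in this document suggest), and multiple keys in one object are combined with AND.

```python
import operator


def _ordered(op):
    # Ordering comparisons are False when the customer lacks the field.
    return lambda actual, expected: actual is not None and op(actual, expected)


OPERATORS = {
    "$eq": operator.eq,
    "$neq": operator.ne,
    "$gt": _ordered(operator.gt),
    "$gte": _ordered(operator.ge),
    "$lt": _ordered(operator.lt),
    "$lte": _ordered(operator.le),
    "$in": lambda actual, expected: actual in expected,
    "$nin": lambda actual, expected: actual not in expected,
    "$contains": lambda a, e: isinstance(a, str) and e in a,
    "$startsWith": lambda a, e: isinstance(a, str) and a.startswith(e),
    "$endsWith": lambda a, e: isinstance(a, str) and a.endswith(e),
}


def matches(customer: dict, criteria: dict) -> bool:
    """Return True if the customer record satisfies the criteria."""
    for key, condition in criteria.items():
        if key == "$and":
            ok = all(matches(customer, c) for c in condition)
        elif key == "$or":
            ok = any(matches(customer, c) for c in condition)
        elif key == "$not":
            ok = not matches(customer, condition)
        elif isinstance(condition, dict):
            actual = customer.get(key)
            ok = all(OPERATORS[op](actual, value) for op, value in condition.items())
        else:
            # Assumption: a bare value is shorthand for $eq.
            ok = customer.get(key) == condition
        if not ok:
            return False
    return True
```

Counting the records for which `matches` returns True gives the same kind of number that the preview endpoint reports, although the real service presumably runs the criteria against its own data store.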

Date Placeholders

The AI can use date placeholders that are resolved at query time:

| Placeholder | Description | Example |
| --- | --- | --- |
| {{N_DAYS_AGO}} | N days before current date | {{30_DAYS_AGO}} |
| {{N_WEEKS_AGO}} | N weeks before current date | {{2_WEEKS_AGO}} |
| {{N_MONTHS_AGO}} | N months before current date | {{3_MONTHS_AGO}} |
| {{START_OF_YEAR}} | January 1st of current year | 2025-01-01T00:00:00Z |
| {{START_OF_MONTH}} | 1st of current month | 2025-12-01T00:00:00Z |
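
One way placeholder resolution could work is sketched below. The function name `resolve_placeholder` is hypothetical, and the code assumes that all dates are UTC, that relative placeholders are truncated to midnight (consistent with the resolved value in the overview example), and that "N months ago" clamps to the last day of shorter months.

```python
import calendar
import re
from datetime import datetime, timedelta, timezone

_RELATIVE = re.compile(r"\{\{(\d+)_(DAYS|WEEKS|MONTHS)_AGO\}\}")


def _months_ago(day: datetime, n: int) -> datetime:
    # Step back n calendar months, clamping the day to the target month's length.
    total = day.year * 12 + (day.month - 1) - n
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return day.replace(year=year, month=month, day=min(day.day, last_day))


def resolve_placeholder(value: str, now: datetime) -> str:
    """Turn a date placeholder into an ISO 8601 string; pass other values through."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if value == "{{START_OF_YEAR}}":
        resolved = midnight.replace(month=1, day=1)
    elif value == "{{START_OF_MONTH}}":
        resolved = midnight.replace(day=1)
    else:
        match = _RELATIVE.fullmatch(value)
        if match is None:
            return value  # not a placeholder, leave untouched
        n, unit = int(match.group(1)), match.group(2)
        if unit == "DAYS":
            resolved = midnight - timedelta(days=n)
        elif unit == "WEEKS":
            resolved = midnight - timedelta(weeks=n)
        else:
            resolved = _months_ago(midnight, n)
    return resolved.strftime("%Y-%m-%dT%H:%M:%SZ")
```

With `now` set to 16 December 2025 (UTC), `{{60_DAYS_AGO}}` resolves to `2025-10-17T00:00:00Z`, matching the overview example.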

Available Fields

Identity & Contact

| Field | Type | Operators | Description |
| --- | --- | --- | --- |
| email | string | eq, neq, contains, startsWith, endsWith | Customer email |
| phoneNumber | string | eq, neq, startsWith | E.164 format phone |
| firstName | string | eq, neq, contains, startsWith | First name |
| lastName | string | eq, neq, contains, startsWith | Last name |
| country | string | eq, neq, in, nin | ISO 3166-1 alpha-2 code |
| city | string | eq, neq, in, contains | City name |

Status & Lifecycle

| Field | Type | Values | Description |
| --- | --- | --- | --- |
| status | enum | ACTIVE, INACTIVE, MERGED, DELETED | Customer status |
| lifecycleStage | enum | SUBSCRIBER, LEAD, OPPORTUNITY, CUSTOMER, ADVOCATE | Lifecycle stage |
| tags | array | - | Customer tags |

Membership

| Field | Type | Description |
| --- | --- | --- |
| membershipTier | string | Membership tier (PLATINUM, GOLD, etc.) |
| membershipStatus | string | Membership status |
| memberSince | datetime | Membership start date |
| membershipExpiry | datetime | Membership expiry date |

| Field | Type | Description |
| --- | --- | --- |
| emailOptIn | boolean | Email marketing consent |
| smsOptIn | boolean | SMS marketing consent |
| pushOptIn | boolean | Push notification consent |
| whatsappOptIn | boolean | WhatsApp consent |

Engagement Scores

| Field | Type | Description |
| --- | --- | --- |
| engagementScore | number | Engagement score (0-100) |
| churnRiskScore | number | Churn risk score (0-100) |
| lifetimeValue | number | Lifetime value in cents |
| predictedLtv | number | Predicted lifetime value |

Activity Metrics

| Field | Type | Description |
| --- | --- | --- |
| lastActivityAt | datetime | Last activity timestamp |
| activityCount30d | number | Activities in last 30 days |
| activityCount90d | number | Activities in last 90 days |
| bookingCount30d | number | Bookings in last 30 days |
| bookingCount90d | number | Bookings in last 90 days |
| lastBookingAt | datetime | Last booking timestamp |
| lastEmailOpenAt | datetime | Last email open |
| lastEmailClickAt | datetime | Last email click |

Golf Data

| Field | Type | Description |
| --- | --- | --- |
| handicap | number | Golf handicap (0-54) |
| handicapProvider | string | Provider (SAGA, USGA) |
| homeClubId | string | Home club ID |
| homeClubName | string | Home club name |

Example Queries

| Natural Language Query | Generated Criteria |
| --- | --- |
| "Platinum and diamond members" | { "membershipTier": { "$in": ["PLATINUM", "DIAMOND"] } } |
| "Golfers with handicap under 10" | { "handicap": { "$lt": 10 } } |
| "Active customers who haven't booked in 60 days" | { "$and": [{ "status": "ACTIVE" }, { "lastBookingAt": { "$lt": "{{60_DAYS_AGO}}" } }] } |
| "High engagement, low churn risk" | { "$and": [{ "engagementScore": { "$gte": 70 } }, { "churnRiskScore": { "$lte": 30 } }] } |
| "Email subscribers in South Africa" | { "$and": [{ "emailOptIn": true }, { "country": "ZA" }] } |
| "New members this year" | { "memberSince": { "$gte": "{{START_OF_YEAR}}" } } |
| "Inactive members with high LTV" | { "$and": [{ "status": "INACTIVE" }, { "lifetimeValue": { "$gte": 500000 } }] } |

Error Handling

Error Codes

| Code | Description | Resolution |
| --- | --- | --- |
| AMBIGUOUS_QUERY | Query is too vague (confidence < 0.5) | Clarify with suggestions |
| INVALID_FIELD | AI returned unknown field | Check available fields |
| PARSE_ERROR | AI returned invalid JSON | Retry or rephrase |
| AI_ERROR | OpenAI API failure | Check API key, retry |

Ambiguous Query Response

When confidence is low, the response includes suggestions:

{
  "success": false,
  "error": {
    "code": "AMBIGUOUS_QUERY",
    "message": "Query is too ambiguous to build segment",
    "suggestions": [
      "What defines 'high engagement'? Specify a score threshold.",
      "Do you want email opt-in customers only?"
    ]
  }
}
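
A client might turn the error codes and their resolutions into a simple decision, as in the sketch below. The function name `next_action` and the action labels (`use_criteria`, `ask_user`, `retry`, `fail`) are hypothetical and not part of the API; only the error codes and the response shape come from this document.

```python
def next_action(response: dict) -> tuple[str, list[str]]:
    """Map an API response to a client-side action and any suggestions to show."""
    if response.get("success"):
        return ("use_criteria", [])

    error = response.get("error", {})
    code = error.get("code")
    suggestions = error.get("suggestions", [])

    if code in ("AMBIGUOUS_QUERY", "INVALID_FIELD"):
        # The user needs to clarify the query or pick a valid field.
        return ("ask_user", suggestions)
    if code in ("PARSE_ERROR", "AI_ERROR"):
        # Transient or model-side failures: retrying may succeed.
        return ("retry", [])
    return ("fail", [])
```

For the ambiguous response shown above, `next_action` returns `ask_user` together with the two clarifying questions, which the UI could present to the user before resubmitting the query.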

Configuration

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| AI_OPENAI_KEY | OpenAI API key | Required |
| AI_MODEL | Model to use | gpt-4o-mini |
| AI_MAX_TOKENS | Max response tokens | 2048 |
| AI_TEMP | Temperature (0-1) | 0.2 |
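
The sketch below shows one way these variables could be read with their documented defaults. The `AIConfig` class and `load_config` function are hypothetical names used for illustration; the service's actual configuration code is not shown in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AIConfig:
    openai_key: str
    model: str
    max_tokens: int
    temperature: float


def load_config(env: Optional[Mapping[str, str]] = None) -> AIConfig:
    """Read the AI settings from the environment, applying documented defaults."""
    env = os.environ if env is None else env

    key = env.get("AI_OPENAI_KEY")
    if not key:
        raise RuntimeError("AI_OPENAI_KEY is required")

    temperature = float(env.get("AI_TEMP", "0.2"))
    if not 0.0 <= temperature <= 1.0:
        # The table above documents the range as 0-1.
        raise ValueError("AI_TEMP must be between 0 and 1")

    return AIConfig(
        openai_key=key,
        model=env.get("AI_MODEL", "gpt-4o-mini"),
        max_tokens=int(env.get("AI_MAX_TOKENS", "2048")),
        temperature=temperature,
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to test.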

Validation

The validator ensures AI output is safe and valid:

  1. Field validation: All fields must exist in registry
  2. Operator validation: Operators must be valid for field type
  3. Enum validation: Enum values must be in allowed list
  4. Type validation: Values must match field types
  5. Date normalization: Placeholders resolved to ISO strings
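
The first four checks in the list above can be sketched as a recursive walk over the criteria. This is an illustrative approximation, not the SegmentBuilderValidator itself: the registry below is a small hand-written subset, the operators listed for `status` are an assumption (the document only lists its allowed values), and date normalization (step 5) is left out.

```python
# Illustrative subset of the field registry. The operators for "status"
# are assumed; the document only gives its enum values.
REGISTRY = {
    "engagementScore": {"type": "number", "operators": {"eq", "neq", "lt", "lte", "gt", "gte"}},
    "membershipTier": {"type": "string", "operators": {"eq", "neq", "in", "nin"}},
    "status": {
        "type": "enum",
        "operators": {"eq", "neq", "in", "nin"},
        "values": {"ACTIVE", "INACTIVE", "MERGED", "DELETED"},
    },
}


def _type_ok(value, field_type: str) -> bool:
    if field_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if field_type == "boolean":
        return isinstance(value, bool)
    return isinstance(value, str)  # string and enum values are both text


def validate(criteria: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the criteria pass."""
    errors: list[str] = []
    for key, condition in criteria.items():
        if key in ("$and", "$or"):
            for sub in condition:
                errors.extend(validate(sub))
        elif key == "$not":
            errors.extend(validate(condition))
        elif key not in REGISTRY:
            errors.append(f"Unknown field: {key}")
        else:
            spec = REGISTRY[key]
            # Assumption: a bare value is shorthand for $eq.
            ops = condition if isinstance(condition, dict) else {"$eq": condition}
            for op, value in ops.items():
                if op.lstrip("$") not in spec["operators"]:
                    errors.append(f"Operator {op} not allowed for {key}")
                    continue
                for item in value if isinstance(value, list) else [value]:
                    if not _type_ok(item, spec["type"]):
                        errors.append(f"Invalid type for {key}: {item!r}")
                    elif "values" in spec and item not in spec["values"]:
                        errors.append(f"Invalid value for {key}: {item}")
    return errors
```

Collecting all errors rather than stopping at the first one lets a caller report every problem in a single response.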

Field Suggestions

For typos, the validator suggests similar fields:

{
  "success": false,
  "error": {
    "code": "INVALID_FIELD",
    "message": "Unknown field: engagmentScore",
    "suggestions": ["engagementScore"]
  }
}
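
The document does not say how the validator finds similar fields. One plausible approach is fuzzy string matching, sketched here with `difflib.get_close_matches` from the Python standard library. The function names, the short field list, and the 0.8 similarity cutoff are assumptions for illustration.

```python
import difflib

# Small sample of field names from this document; a real registry would hold all of them.
KNOWN_FIELDS = [
    "engagementScore",
    "churnRiskScore",
    "membershipTier",
    "status",
    "emailOptIn",
    "lastBookingAt",
]


def suggest_fields(unknown: str, known=KNOWN_FIELDS, limit: int = 3) -> list[str]:
    """Return up to `limit` known field names that closely resemble `unknown`."""
    return difflib.get_close_matches(unknown, known, n=limit, cutoff=0.8)


def invalid_field_error(unknown: str) -> dict:
    """Build an INVALID_FIELD response in the shape shown above."""
    return {
        "success": False,
        "error": {
            "code": "INVALID_FIELD",
            "message": f"Unknown field: {unknown}",
            "suggestions": suggest_fields(unknown),
        },
    }
```

Calling `invalid_field_error("engagmentScore")` reproduces the example response above, while a name with no close match yields an empty suggestions list.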