Data Sources

Data sources are the foundation of Chat Aid. By connecting your company's apps and documents, you enable Chat Aid to answer questions with accurate, company-specific information.

Overview

Chat Aid integrates with 40+ data sources across multiple categories:

  • File Storage - Google Drive, Dropbox (Business & Personal), OneDrive, SharePoint, Custom Files
  • Knowledge Management - Notion, Confluence, ReadMe, Help Scout Docs, Coda, Zendesk Help Center
  • Helpdesk - Zendesk, Freshdesk, Intercom, Gorgias, Help Scout
  • Ticketing/Project Management - Jira, ClickUp, Trello, Asana, Monday.com, Linear, Azure DevOps, GitLab, Bitbucket, and more
  • CRM - HubSpot, Pipedrive, Copper, Freshsales, Salesforce, Zoho CRM, Capsule, Zendesk Sell, Close
  • HRIS - Humaans, Zoho People, GreytHR, HaileyHR, Square Payroll
  • Communication - Slack channels, Microsoft Teams channels
  • Marketing - ActiveCampaign, Accelo
  • Web Sources - Train on public or internal websites
  • Custom Files - Upload PDFs, Word docs, spreadsheets, ZIP archives, and more

How Data Sources Work

1. Connect

Securely connect Chat Aid to your data source using OAuth or API keys. Chat Aid requests only the minimum permissions needed to read your content.

2. Select

Choose which content to train on:

  • Specific folders or drives
  • Certain Notion workspaces or pages
  • Particular Slack channels
  • Selected Zendesk categories

3. Train

Chat Aid processes your content using either Advanced Training for optimized integrations (Slack, Teams, Notion, Confluence, etc.) or Standard Training for other sources:

  • Documents are intelligently chunked into searchable segments
  • Text is extracted from PDFs, images, and more
  • Structure and context are preserved
  • Embeddings are generated for semantic search
  • Metadata is integrated for better retrieval
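
As a rough illustration of the chunking step, the sketch below splits text into overlapping word windows. Chat Aid's actual chunking strategy is not documented here, so the function name `chunk_words` and the `size` and `overlap` parameters are assumptions chosen only to show the general idea.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g", size=4, overlap=2))
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, which is one common way to preserve context.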

Want better results?

Check out our Optimization Guide to learn how to structure your content for maximum accuracy.

When a question is asked:

  • Chat Aid searches across all your connected data sources
  • Relevant information is retrieved using semantic search
  • AI generates a comprehensive answer
  • Sources are cited for verification
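
The retrieval step can be pictured as a nearest-neighbour search over embeddings. The sketch below is a minimal illustration using hand-written two-dimensional vectors and cosine similarity; the vectors, source names, and the `top_k` helper are hypothetical and do not reflect Chat Aid's internal model or index.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], indexed: list[tuple], k: int = 2) -> list[tuple]:
    """Return the k (source, chunk) pairs whose vectors are most similar to the query."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[2]), reverse=True)
    return [(source, chunk) for source, chunk, _ in ranked[:k]]


# Toy index: (source, chunk text, embedding). Real embeddings have far more dimensions.
index = [
    ("Notion", "PTO policy", [1.0, 0.0]),
    ("Jira", "Sprint board", [0.0, 1.0]),
    ("Drive", "Holiday calendar", [0.8, 0.6]),
]
print(top_k([1.0, 0.0], index))
```

Returning the source alongside each chunk is what makes it possible to cite where an answer came from.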

Auto-Retrain (Automatic Updates)

Keep your knowledge base up-to-date automatically by setting up auto-retrain schedules for your data sources.

What is Auto-Retrain?

Auto-retrain automatically re-indexes your data sources on a schedule, ensuring Chat Aid always has the latest information without manual intervention.

Benefits:

  • 🔄 Always up-to-date information
  • ⏰ Set it and forget it
  • 📊 Automatic sync with source changes
  • 🎯 No manual retraining needed

Available Frequencies

Frequency  | Description                    | Availability
Never      | Manual retraining only         | All plans
Monthly    | Retrain once per month         | All plans
Weekly     | Retrain once per week          | Team/Enterprise plans
Daily      | Retrain once per day           | Team/Enterprise plans
Continuous | Real-time updates via webhooks | Enterprise plans (select sources)

Plan Requirements

Auto-retrain frequencies vary depending on your plan. To see which frequencies are available for your plan:

  1. Go to Settings → Billing
  2. View your current plan's features
  3. Upgrade your plan to access more frequent retraining options (Daily/Weekly/Continuous)
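
The availability rules from the Available Frequencies table can be written down as a small lookup. This is only a sketch: the mapping is transcribed from the table, the helper name `available_frequencies` is hypothetical, and "Starter" stands in for any plan other than Team or Enterprise. The "select sources" restriction on Continuous is not modelled.

```python
ALL_PLANS = None  # marker meaning "available on every plan"

# Which plans unlock each frequency, transcribed from the Available Frequencies table.
FREQUENCY_PLANS = {
    "Never": ALL_PLANS,
    "Monthly": ALL_PLANS,
    "Weekly": {"Team", "Enterprise"},
    "Daily": {"Team", "Enterprise"},
    "Continuous": {"Enterprise"},  # also limited to select sources (not modelled here)
}


def available_frequencies(plan: str) -> list[str]:
    """List the auto-retrain frequencies a plan can choose, in table order."""
    return [
        frequency
        for frequency, plans in FREQUENCY_PLANS.items()
        if plans is ALL_PLANS or plan in plans
    ]


print(available_frequencies("Team"))
```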

Setting Up Auto-Retrain

  1. Navigate to your data source:

    • Go to Integrations
    • Select your connected source (e.g., Google Drive, Notion)
    • Click on a specific document or folder
  2. Open Auto-Retrain Settings:

    • Click the ⋮ (three dots) menu next to the source
    • Select Auto-retrain
  3. Choose Frequency:

    • Never - No automatic updates (default)
    • Daily - Updates every day at midnight UTC
    • Weekly - Updates every Monday at midnight UTC
    • Monthly - Updates on the 1st of each month at midnight UTC
  4. Save Settings:

    • Click Confirm
    • Auto-retrain is now active for that source
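
To make the schedule concrete, the sketch below computes the next run time for each frequency from the rules listed in step 3 (midnight UTC daily, every Monday, or on the 1st of the month). The function name `next_run` is hypothetical, and the code assumes a timezone-aware `now`; it is an illustration of the stated schedule, not Chat Aid's scheduler.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_run(frequency: str, now: datetime) -> Optional[datetime]:
    """Return the next scheduled retrain strictly after `now`, or None for "never"."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if frequency == "never":
        return None
    if frequency == "daily":
        return midnight + timedelta(days=1)
    if frequency == "weekly":
        # weekday() is 0 on Monday, so this always moves forward to the next Monday.
        return midnight + timedelta(days=7 - midnight.weekday())
    if frequency == "monthly":
        if midnight.month == 12:
            return midnight.replace(year=midnight.year + 1, month=1, day=1)
        return midnight.replace(month=midnight.month + 1, day=1)
    raise ValueError(f"unknown frequency: {frequency}")


wednesday = datetime(2024, 5, 15, 13, 30, tzinfo=timezone.utc)
print(next_run("weekly", wednesday))
```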

How Auto-Retrain Works

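Documents - Chat Aid checks each document for updates and retrains it if it has changed. The knowledge base is updated, and chat history is kept.

Websites - Chat Aid recrawls the URLs, detects new or changed pages, and updates the index while following the crawl limits.

Integrations - Chat Aid syncs through the source's API, pulls changed or new pages, and re-indexes them while keeping their structure.

Slack/Teams - Messages sync in real time through webhooks when enabled. Otherwise, they sync in batches on the daily, weekly, or monthly schedule.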
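
One common way to implement "retrain only if changed" is to store a content hash per document and compare it on each sync. The sketch below assumes that approach; how Chat Aid actually detects changes is not stated, and `content_hash` and `plan_sync` are hypothetical names.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored hashes (doc id -> hash) with the latest text (doc id -> text)."""
    new, changed = [], []
    for doc_id, text in current.items():
        if doc_id not in previous:
            new.append(doc_id)
        elif previous[doc_id] != content_hash(text):
            changed.append(doc_id)
    removed = [doc_id for doc_id in previous if doc_id not in current]
    return {"new": sorted(new), "changed": sorted(changed), "removed": sorted(removed)}


stored = {"a": content_hash("v1"), "b": content_hash("v1"), "c": content_hash("x")}
latest = {"a": "v1", "b": "v2", "d": "brand new"}
print(plan_sync(stored, latest))
```

Only the documents listed under `new` and `changed` would need re-indexing; unchanged documents are skipped.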

Manual Retraining

You can manually retrain a source when allowed by your plan and timing restrictions:

  1. Navigate to the source
  2. Click the ⋮ menu
  3. Select Retrain Now
  4. Confirm the action

When Manual Retrain is Available:

Manual retraining is limited by your plan's training frequency and time since last training:

  • Monthly training frequency: Can retrain once every 30 days
  • Weekly training frequency: Can retrain once every 7 days
  • Daily training frequency: Can retrain once every 24 hours
  • Continuous training frequency: Can retrain every 12 hours (Slack) or 1 hour (other sources)
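
The waiting periods above translate directly into a lookup of minimum intervals. In this sketch the interval values come from the list, while the function names and the lowercase `"slack"` source identifier are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Minimum wait between manual retrains, taken from the list above.
COOLDOWNS = {
    "monthly": timedelta(days=30),
    "weekly": timedelta(days=7),
    "daily": timedelta(hours=24),
}


def cooldown(frequency: str, source: str) -> timedelta:
    """Return the minimum wait for a training frequency and source type."""
    if frequency == "continuous":
        # Slack has a longer wait than other continuously synced sources.
        return timedelta(hours=12) if source == "slack" else timedelta(hours=1)
    return COOLDOWNS[frequency]


def next_manual_retrain(frequency: str, source: str, last_trained: datetime) -> datetime:
    """Earliest moment a manual retrain becomes available again."""
    return last_trained + cooldown(frequency, source)


last = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(next_manual_retrain("weekly", "notion", last))
```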

When Manual Retrain is Disabled:

You cannot manually retrain a source if:

  • ❌ The time restriction hasn't passed (based on your plan's frequency)
  • ❌ Source is archived
  • ❌ Your plan doesn't allow the training frequency needed
  • ❌ Training is currently in progress
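
The four conditions above can be read as a checklist that must come back empty before a manual retrain is offered. The sketch below models them as boolean flags; the `SourceState` fields and `blocking_reasons` helper are hypothetical and only mirror the list, not any real Chat Aid API.

```python
from dataclasses import dataclass


@dataclass
class SourceState:
    cooldown_elapsed: bool        # has the plan's time restriction passed?
    archived: bool                # is the source archived?
    plan_allows_frequency: bool   # does the plan include the needed frequency?
    training_in_progress: bool    # is a training run currently active?


def blocking_reasons(state: SourceState) -> list[str]:
    """Return every reason manual retrain is disabled; an empty list means it is available."""
    checks = [
        (not state.cooldown_elapsed, "time restriction has not passed"),
        (state.archived, "source is archived"),
        (not state.plan_allows_frequency, "plan does not allow the required training frequency"),
        (state.training_in_progress, "training is currently in progress"),
    ]
    return [reason for blocked, reason in checks if blocked]


print(blocking_reasons(SourceState(False, True, True, True)))
```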

Upgrade for Faster Retraining

Higher-tier plans offer more frequent training options. Check Settings → Billing to see available plan features.

Manual Retraining is Useful For:

  • Reflecting urgent changes once the time restriction has passed
  • Testing after source updates
  • Refreshing data after authentication issues are resolved

Connecting Your First Data Source

Prerequisites

  • Admin access to the data source you want to connect
  • A Chat Aid account (any plan tier)

Step-by-Step

  1. Navigate to Integrations

    • Log in to the Chat Aid dashboard
    • Click Integrations in the sidebar
  2. Find Your Data Source

    • Browse by category or use the search bar
    • Click on your desired integration
  3. Authenticate

    • Click Connect
    • Sign in to your data source
    • Grant the requested permissions
  4. Select Content

    • Choose which folders, pages, or channels to train on
    • Configure sync settings (auto-refresh, inclusion rules)
  5. Start Training

    • Click Start Training
    • Training begins immediately
    • You'll receive an email when complete

Managing Data Sources

Viewing Connected Sources

Navigate to Integrations to see all your connected data sources. Each source shows:

  • Status - Connected, Training, Error, or Paused
  • Last Synced - When data was last updated
  • Document Count - Number of documents indexed
  • Team - Which team has access (if using Teams feature)

Refreshing Data

To update data from a connected source:

  1. Go to Integrations
  2. Find your data source
  3. Click Refresh or Sync Now

Most sources auto-refresh daily. You can also configure webhooks for real-time updates.

Disconnecting

To fully disconnect a source:

  1. Go to Integrations
  2. Click on the data source
  3. Select Disconnect

Warning

Disconnecting removes all trained data from Chat Aid. Questions won't be answerable from that source until you reconnect and retrain.

Supported Integrations

Chat Aid connects to 40+ data sources across multiple categories.

👉 Browse all integrations →

Categories include File Storage, Knowledge Management, Helpdesk, Ticketing, CRM, HRIS, Communication, and more.

Supported File Types

Chat Aid can process these file formats:

Format        | Extension
Google Docs   | -
Google Sheets | -
Google Slides | -
PDF           | .pdf
Word          | .docx
Excel         | .xlsx
PowerPoint    | .pptx
Text          | .txt
CSV           | .csv
JSON          | .json
ZIP Archives  | .zip

[Per-source support columns (Custom Files, Google Drive, File Storage): values not available]

Tip

For best results, use structured formats like Word, PDF, or Markdown. Chat Aid preserves formatting, headers, and structure for better context.

JSON Files - Limited Support

JSON files are supported with the following limitations:

  • Only top-level object properties are processed
  • Single arrays of objects are supported
  • Nested objects and nested arrays are not supported

For complex JSON data structures, consider converting to CSV or a document format.
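
Before uploading, it can help to check whether a JSON file fits these limits. The sketch below reads the limits as "a flat object, or a single array of flat objects"; that interpretation, along with the helper names, is an assumption rather than Chat Aid's documented validation logic.

```python
import json

# Values that count as "not nested": strings, numbers, booleans, and null.
SCALARS = (str, int, float, bool, type(None))


def is_flat_object(value) -> bool:
    """True for a JSON object whose property values are all scalars."""
    return isinstance(value, dict) and all(isinstance(v, SCALARS) for v in value.values())


def fits_documented_limits(raw: str) -> bool:
    """True if the JSON is a flat object or a single array of flat objects."""
    data = json.loads(raw)
    if isinstance(data, list):
        return all(is_flat_object(item) for item in data)
    return is_flat_object(data)


print(fits_documented_limits('[{"name": "Ada", "team": "HR"}]'))
print(fits_documented_limits('{"user": {"name": "Ada"}}'))
```

Files that fail this check are candidates for flattening to CSV, as suggested above.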

ZIP Archives

ZIP files are automatically extracted when uploaded as custom files. Each extracted file must be under 100MB, with a maximum total extraction size of 2GB per ZIP archive. All files inside must be in supported formats.
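
A local pre-check along these lines can catch oversized or unsupported archives before upload. This is a sketch under assumptions: it treats 1MB as 1024 × 1024 bytes and 2GB as 2 × 1024³ bytes (the exact definition is not stated), takes the extension list from the file-type table above, and uses a hypothetical helper name `check_zip`.

```python
import zipfile
from pathlib import PurePosixPath

MAX_FILE_BYTES = 100 * 1024 * 1024   # assumed meaning of "100MB"
MAX_TOTAL_BYTES = 2 * 1024 ** 3      # assumed meaning of "2GB"
SUPPORTED = {".pdf", ".docx", ".xlsx", ".pptx", ".txt", ".csv", ".json"}


def check_zip(source) -> list[str]:
    """Inspect a ZIP (path or file-like object) and return a list of problems found."""
    problems = []
    total = 0
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            total += info.file_size  # uncompressed size of this member
            if info.file_size >= MAX_FILE_BYTES:
                problems.append(f"{info.filename}: not under 100MB")
            if PurePosixPath(info.filename).suffix.lower() not in SUPPORTED:
                problems.append(f"{info.filename}: unsupported format")
    if total > MAX_TOTAL_BYTES:
        problems.append("total extracted size exceeds 2GB")
    return problems
```

An empty list means the archive passes these particular checks; it does not guarantee the upload will succeed.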

Best Practices

1. Start with High-Value Sources

Connect data sources your team uses most frequently:

  • For support teams: Help desk + knowledge base
  • For engineering: Notion/Confluence + Jira
  • For sales: CRM + Google Drive (sales materials)

2. Organize Your Data

Well-organized data leads to better answers:

  • Use clear folder structures
  • Name documents descriptively
  • Keep documents up to date
  • Remove outdated content

3. Use Chat Aid's Teams Feature for Sensitive Data

If you have department-specific information:

  • Create a team for that department
  • Connect data sources to that team only
  • Team members get access to both company and team data
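
The access rule in the last bullet can be sketched as a simple filter: a source is visible if it is company-wide or belongs to one of the user's teams. The data model below (a source name mapped to a team name, with `None` meaning company-wide) and the `visible_sources` helper are hypothetical illustrations, not Chat Aid's permission system.

```python
from typing import Optional


def visible_sources(user_teams: set[str], source_teams: dict[str, Optional[str]]) -> list[str]:
    """Return the sources a user can draw answers from, sorted by name."""
    return sorted(
        name
        for name, team in source_teams.items()
        if team is None or team in user_teams  # company-wide, or one of the user's teams
    )


sources = {
    "Company Handbook": None,    # company-wide
    "HR Salaries": "HR",         # restricted to the HR team
    "Sales Playbook": "Sales",   # restricted to the Sales team
}
print(visible_sources({"HR"}, sources))
```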

Learn more: Teams Feature →

4. Match Auto-Retrain Frequency to Update Patterns

  • Frequently updated docs → Daily or Weekly
  • Stable documentation → Monthly
  • Archived content → Never (manual only)

5. Monitor Quality

Track how well Chat Aid answers questions:

  • Review the Questions page regularly
  • Check which sources are being cited
  • Add more data sources if needed
  • Update stale content

Permissions and Security

What Access Does Chat Aid Need?

Chat Aid requests read-only access to your data sources whenever possible. We never:

  • Modify your documents
  • Delete any content
  • Share your data with third parties
  • Use your data to train any AI models

How Is My Data Protected?

  • Encryption in transit: TLS 1.2+ for all connections
  • Encryption at rest: AES-256 encryption in our database
  • SOC 2 Type II certified: Annual audits verify our security controls
  • ISO 27001 certified: International security standards
  • GDPR & CCPA compliant: Full compliance with data protection laws

Learn more: Security & Compliance →

Can I Control Who Sees What?

Yes! Use Chat Aid's Teams feature to isolate sensitive data:

  • Create department-specific teams
  • Connect data sources to specific teams
  • Only team members can see that data

Learn more: Teams Feature →

Troubleshooting

Connection Failed

If you can't connect a data source:

  • Verify you have admin permissions
  • Check your network/firewall settings
  • Try disconnecting and reconnecting
  • Contact support@chataid.com if the issue persists

Training Takes Too Long

For large data sources:

  • Training can take several hours
  • You'll get an email when complete
  • Chat Aid works on smaller batches while training continues

Sources Not Being Cited

If Chat Aid isn't using a specific source:

  • Verify the data trained successfully
  • Check that the content is relevant to the questions asked
  • Try asking more specific questions
  • Refresh the data source

Problem: Auto-retrain option is locked

Cause: Your plan doesn't include that frequency level

Solution: Upgrade your plan to access Daily/Weekly retraining. If a manual retrain is locked by a time restriction instead, wait for the required time to pass.

Problem: Source not retraining automatically

Check:

  1. Is auto-retrain enabled for that specific source?
  2. Has enough time passed since last retrain?
  3. Does the source still have valid authentication?

Solution:

  • Re-enable auto-retrain if needed
  • Wait for the scheduled time
  • Re-authenticate the connection and try again

Problem: Retrain failed

Common Causes:

  • Authentication expired
  • Source permissions changed
  • Source content was deleted
  • Service temporarily unavailable

Solution:

  • Check source connection status
  • Re-authenticate if needed
  • Verify the source still exists
  • Contact support@chataid.com if the issue persists
