URLs


Overview

The URLs feature lets you feed web content directly into your bot’s knowledge base. When you add links to relevant web pages, the bot crawls them and extracts their text so it can answer customer questions more accurately.
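
To make “crawl and extract text” concrete, the sketch below shows roughly what that step involves: fetching a page and keeping only its plain text. It is an illustration using the third-party requests and beautifulsoup4 packages, not the bot’s actual crawler.

```python
import requests
from bs4 import BeautifulSoup

def extract_page_text(url: str) -> str:
    """Fetch a page and return its visible text, dropping scripts and styles."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Only text content is kept; script/style tags (and images by nature) are ignored.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())

# Example call with a placeholder URL.
print(extract_page_text("https://example.com/privacy-policy")[:200])
```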

Each URL has a learning status:

  • Imported: Successfully learned and integrated.

  • In Queue: Still processing.

  • Overlapped: Content overlaps with existing knowledge (duplicate/similar).

Figure 1. Training Data - URLs
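
For reference, the three learning statuses can be thought of as a simple state field on each URL record. The names below mirror the labels shown in the UI; the class itself is illustrative, not the product’s internal schema.

```python
from enum import Enum

class UrlStatus(Enum):
    """Illustrative states mirroring the labels shown in the URLs tab."""
    IMPORTED = "Imported"      # successfully learned and integrated
    IN_QUEUE = "In Queue"      # still being crawled and processed
    OVERLAPPED = "Overlapped"  # duplicate or similar content detected

# A newly added URL starts out in the processing queue.
status = UrlStatus.IN_QUEUE
print(status.value)  # "In Queue"
```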

Steps to Use

  1. Access the URLs tab

  • Go to Training Data → URLs in your bot’s dashboard.

  2. Add a new URL

  • Enter the full link (max 255 characters) in the input field.

  • Click Add URL → the link goes into the queue for processing (a quick pre-check for format and length is sketched after this list).

  3. View existing URLs

  • See all added links, creation time, status, and actions.

  • Use pagination for long lists.

  4. Interpret status

  • Imported → URL processed successfully.

  • In Queue → still crawling.

  • ⚠️ Overlapped → duplicate or similar content detected.

  5. Re-crawl URLs

  • Use Recrawl all URLs (top right) to refresh data.

  6. Delete URLs

  • Click 🗑️ to remove a URL (this also deletes its learned content).
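
Before adding a link (step 2), you can pre-check the two documented constraints yourself: the URL must use http or https and be at most 255 characters. A minimal sketch, assuming nothing beyond those two rules (the helper name is illustrative):

```python
from urllib.parse import urlparse

MAX_URL_LENGTH = 255  # documented limit for the URL input field

def is_addable_url(url: str) -> bool:
    """Check the two documented constraints: http/https scheme and length <= 255."""
    if len(url) > MAX_URL_LENGTH:
        return False
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

print(is_addable_url("https://example.com/docs/data-sync"))  # True
print(is_addable_url("ftp://example.com/file"))               # False
```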


Example / Use Case

A company adds:

  • Privacy policy page

  • Product data sync documentation

The bot then learns directly from these pages and can answer questions like “How does data sync work?” or “Where is my data stored?”
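
The snippet below is a toy illustration of what “learning from these pages” enables: once text has been extracted, a question can be matched against it. The bot’s real retrieval is far more capable; the page snippets and matching logic here are placeholders only.

```python
# Placeholder texts standing in for the extracted content of the two pages.
knowledge = {
    "privacy-policy": "This page explains where customer data is stored and how it is protected.",
    "data-sync-docs": "This page explains how product data sync works between systems.",
}

def answer(question: str) -> str:
    """Return the page snippet sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(knowledge.values(), key=lambda text: len(q_words & set(text.lower().split())))

print(answer("How does data sync work?"))  # matches the data sync page
```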


Notes

  • Processing may take time depending on page size.

  • Only text content is extracted (no images, videos, or scripts).

  • Avoid adding many pages with similar content → this can reduce accuracy (the similarity sketch after these notes shows why overlap matters).

  • Deleting a URL removes all knowledge derived from it.
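
To see why similar pages end up flagged, here is one common way textual overlap is measured: word-set (Jaccard) similarity between two pages. This shows the general idea only; the overlap detection the bot actually uses is not documented here.

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Word-set overlap between two texts: 0.0 means disjoint, 1.0 means identical."""
    words_a, words_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

# Two near-identical pages score very high and would add little new knowledge.
page_a = "Our privacy policy explains how customer data is stored and protected."
page_b = "Our privacy policy describes how customer data is stored and protected."
print(round(jaccard_similarity(page_a, page_b), 2))
```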


Quick Tips

  • Add official documentation or policy pages for accurate answers.

  • Keep URLs stable (avoid links that change often).

  • Periodically use Recrawl to refresh outdated content.
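
If you want a nudge to recrawl on a regular schedule, the loop below simply prints a reminder at a fixed interval. This section documents no API for triggering Recrawl from code, so the refresh itself still happens from the dashboard; the weekly interval is an assumption.

```python
import time

RECRAWL_INTERVAL_DAYS = 7  # assumption: weekly; adjust to how often your pages change

def recrawl_reminder_loop() -> None:
    """Print a periodic reminder to use the dashboard's Recrawl all URLs button."""
    while True:
        print("Reminder: open Training Data → URLs and click 'Recrawl all URLs'.")
        time.sleep(RECRAWL_INTERVAL_DAYS * 24 * 60 * 60)

# recrawl_reminder_loop()  # uncomment to run; this call blocks the current thread
```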

Troubleshooting

  • URL won’t add → Check link format (http/https) and length ≤255 characters.

  • Status stuck in Queue → Wait, then try Recrawl.

  • Overlapped warning → Remove duplicate or redundant links.

  • Bot not using content → Ensure the page has visible text (not just images or dynamic scripts).
