URLs
Overview
The Training Data - URLs feature in CX Genie lets you add, manage, and monitor the URLs that your bot learns from to build its knowledge base. When you provide URLs to relevant web pages, the bot crawls them automatically and extracts information it can use to answer user queries more accurately.
The bot's knowledge status for each URL is categorized into three states:
Imported: The bot has successfully learned and integrated the content from the URL.
In Queue: The bot is currently processing and learning from the URL.
Overlapped: The content from the URL overlaps with existing knowledge, typically because of similar or duplicated questions.
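If you track URL statuses outside the dashboard, for example in a review script or spreadsheet export, the three states can be modeled as a small lookup. The sketch below is illustrative only: the names `UrlStatus` and `next_step` and the suggested follow-up texts are assumptions made for this example and are not part of CX Genie.

```python
from enum import Enum


class UrlStatus(Enum):
    """The three status labels shown in the Status column."""
    IMPORTED = "Imported"
    IN_QUEUE = "In Queue"
    OVERLAPPED = "Overlapped"


# Hypothetical follow-up actions, paraphrased from this guide.
NEXT_STEP = {
    UrlStatus.IMPORTED: "No action needed; the content has been learned.",
    UrlStatus.IN_QUEUE: "Wait, then check the Status column again.",
    UrlStatus.OVERLAPPED: "Review the URL and decide whether to keep or remove it.",
}


def next_step(label: str) -> str:
    """Map a status label copied from the dashboard to a suggested action.

    Raises ValueError if the label is not one of the three known statuses.
    """
    return NEXT_STEP[UrlStatus(label.strip())]


if __name__ == "__main__":
    for label in ("Imported", "In Queue", "Overlapped"):
        print(f"{label}: {next_step(label)}")
```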
Steps to Use
Access the URLs tab
Navigate to the Training Data section in the CX Genie dashboard and select the URLs tab.
Add a new URL
Enter the full URL in the Input the URL field at the top.
The input supports up to 255 characters.
Click the Add URL button to submit. The URL will then be queued for crawling and learning by the bot.
View existing URLs
The list below the input field shows every added URL with its creation date and time, current learning status, and available actions.
Scroll through the list or use pagination to browse multiple pages.
Interpret the status
A green checkmark means the URL content has been successfully learned (Imported).
A numeric progress indicator with the label In Queue means the bot is currently processing the URL data.
The Overlapped label indicates that the content overlaps with existing knowledge, often due to similar or duplicate questions.
Re-crawl URLs
To refresh and update the data from all URLs, click the Recrawl all URLs button at the top right.
Delete URLs
Use the trash can icon in the Actions column to remove any URL from the list. This will also remove the associated learned data from the bot.
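Before pasting a batch of URLs into the Input the URL field, it can help to pre-check them locally. The following is a minimal sketch that uses only the Python standard library. The 255-character limit comes from the step above. Treating a "full URL" as one with an `http://` or `https://` scheme and a host name is an assumption, and `check_url` is a hypothetical helper, not a CX Genie function.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 255  # limit stated for the Input the URL field


def check_url(url: str) -> list[str]:
    """Return a list of problems found in the URL (empty if none)."""
    problems = []
    candidate = url.strip()

    if len(candidate) > MAX_URL_LENGTH:
        problems.append(
            f"too long: {len(candidate)} characters (maximum {MAX_URL_LENGTH})"
        )

    parts = urlsplit(candidate)
    # Assumption: a "full URL" includes an http or https scheme and a host.
    if parts.scheme not in ("http", "https"):
        problems.append("missing http:// or https:// scheme")
    if not parts.netloc:
        problems.append("missing host name")

    return problems


if __name__ == "__main__":
    for url in ("https://example.com/privacy", "example.com/privacy"):
        issues = check_url(url)
        print(url, "->", "ok" if not issues else "; ".join(issues))
```

A URL that passes this check can still fail to crawl, for example if the page requires a login, so the Status column remains the source of truth.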
Example / Use Case
Imagine you want your bot to answer questions about your company’s privacy policies and product synchronization features. You add URLs such as your privacy policy page and product data sync documentation. The bot crawls these URLs, learns the content, and can then respond to user inquiries based on this up-to-date information.
If two URLs have very similar content (e.g., different pages covering the same topic), the system flags one as Overlapped so that you can review it and decide whether to keep or remove the duplicate. This keeps the knowledge base clean.
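To spot likely duplicates before adding them, you can compare the text of two pages locally. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to compute a word-level similarity ratio. It is not how CX Genie detects overlap, which this guide does not describe. The `similarity` and `likely_overlap` helpers and the 0.8 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Return a ratio between 0.0 and 1.0 based on matching word sequences."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False keeps frequent words from being ignored in long texts.
    return SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()


def likely_overlap(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    """Flag two texts as probable duplicates (the threshold is an assumption)."""
    return similarity(text_a, text_b) >= threshold


if __name__ == "__main__":
    page_a = "Our privacy policy explains how we collect and use your data."
    page_b = "Our privacy policy explains how we collect and use your data."
    page_c = "Product sync runs hourly between the store and the catalog."
    print(likely_overlap(page_a, page_b))
    print(likely_overlap(page_a, page_c))
```

If the check flags a pair, consider adding only the more complete page.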
Notes
The bot's learning process may take some time after adding a new URL. You can monitor progress in the Status column.
Only text-based content from the URLs is learned; multimedia or scripts are not processed.
Overlapping content can reduce the bot’s response accuracy, so avoid adding multiple URLs with very similar or duplicated content.
When you delete a URL, all associated knowledge from that URL is also removed from the bot's knowledge base.
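Because only text-based content is learned, it can be useful to preview roughly how much plain text a page offers before adding it. The sketch below strips tags and skips `<script>`, `<style>`, and `<noscript>` content with Python's built-in `html.parser`. It only approximates "text-based content" and makes no claim about how the CX Genie crawler extracts text. `VisibleTextParser` and `visible_text` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring script, style, and noscript content."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the text content of an HTML string, joined by single spaces."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    sample = "<h1>Privacy</h1><script>var x = 1;</script><p>We store data.</p>"
    print(visible_text(sample))
```

A page that yields little or no text here, such as one rendered entirely by scripts, is unlikely to give the bot much to learn from.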
