- Click and then set the configurations below if you wish to fine-tune your Agent and optimize the model’s performance.
Lookback Limit
You can configure this setting if you are creating a Conversational Agent; it is not applicable to an Automation Agent. Define the ‘Lookback Limit’ to control how much historical chat the Agent can refer to when generating responses. This is crucial for tasks that benefit from contextual continuity.
- Choose any limit from 0 to 50 and control the amount of historical chat considered.
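The effect of this setting can be sketched in plain Python. `apply_lookback_limit` is a hypothetical helper written for illustration, not part of the product API:

```python
def apply_lookback_limit(history, lookback_limit):
    """Return only the last `lookback_limit` messages of the chat history.

    A limit of 0 means no historical context is passed to the model.
    """
    if lookback_limit <= 0:
        return []
    return history[-lookback_limit:]

history = [
    {"role": "user", "content": "What is RAG?"},
    {"role": "assistant", "content": "Retrieval-Augmented Generation..."},
    {"role": "user", "content": "How do I enable it?"},
]
# With a limit of 2, only the two most recent turns are kept as context.
context = apply_lookback_limit(history, lookback_limit=2)
```

A higher limit preserves more conversational continuity at the cost of a longer prompt.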
Model Settings
- Temperature: Choose any limit from 0 to 2 to control the randomness of the generated content.
For example, lower values produce more deterministic results, while higher values yield more creative and diverse answers.
- Top P: Choose any limit from 0 to 1 to set a threshold for cumulative probability during word selection, refining content by excluding less probable words.
For example, setting top_p to 0.7 ensures that only words contributing to at least 70% of the likely choices are considered, refining responses.
- Vision (Applicable only for the Bedrock Claude 3 Haiku model): Enable the Vision option to decide if visual data processing is required.
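How Temperature and Top P interact can be illustrated with a minimal nucleus-sampling sketch in plain Python. The function and the toy token scores below are illustrative only, not the product’s internals:

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0):
    """Pick one token: temperature rescales the scores, top_p trims the tail."""
    # Lower temperature sharpens the distribution (more deterministic);
    # higher temperature flattens it (more diverse).
    scaled = {tok: logit / max(temperature, 1e-6) for tok, logit in logits.items()}
    # Softmax to turn scores into probabilities.
    m = max(scaled.values())
    exps = {tok: math.exp(v - m) for tok, v in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Top-p filtering: keep the smallest set of tokens whose cumulative
    # probability reaches top_p, then sample from that set.
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    r = random.random() * sum(kept.values())
    for tok, p in kept.items():
        r -= p
        if r <= 0:
            return tok
    return tok
```

With a very low top_p, only the single most likely word survives the filter, which is why tighter thresholds produce more focused wording.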
Advanced Retrieval
You can configure these settings if you are using the RAG prompt template.
- Top K: Choose any limit from 1 to 20 to limit the AI model to considering only the most probable words for each token generated, aiding in controlling the generation process.
For example, setting top_k to 10 ensures that only the top 10 most likely words are considered for each word generated.
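The top_k filter can be sketched in a few lines; `top_k_filter` is a hypothetical name used for illustration:

```python
def top_k_filter(token_probs, k):
    """Keep only the k most probable candidate words (illustrative sketch)."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])

# With k=2, only the two most likely candidates remain in play.
candidates = top_k_filter({"sky": 0.5, "sea": 0.3, "sofa": 0.2}, k=2)
```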
- Reranker: Enable the Reranker to reorder the initial chunks so that the most relevant and accurate results are prioritized during response generation. This helps refine and improve the quality of the Agent-generated content by reorganizing the initial candidate answers or information chunks based on relevance criteria.
- Reranker Model: Choose any one of the following models for reranking:
- Cohere
- Colbert
- Top N: Choose any limit from 1 to 100 to rerank the top N most relevant and accurate chunks of information, refining output by prioritizing candidate responses or segments that best meet user needs. This setting enhances response quality by focusing on key information segments, ensuring effective and tailored content delivery.
For example: Suppose a user asks, “Can you explain the benefits of cloud computing?” With Top N set to 3, the AI can identify and rerank the top 3 most relevant chunks of information related to cloud computing benefits. These chunks might include scalability advantages, cost-efficiency considerations, and enhanced data security features. By focusing on these key aspects, the AI delivers a well-organized and informative response that highlights the most significant benefits of cloud computing, tailored to the user’s query.
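The rerank-and-keep-Top-N step can be sketched as follows. Here `overlap_score` is a toy word-overlap stand-in for a real reranker model such as Cohere or ColBERT; both function names are hypothetical:

```python
import re

def rerank_top_n(query, chunks, score_fn, top_n=3):
    """Score each candidate chunk against the query, keep the top N."""
    ranked = sorted(chunks, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_n]

def overlap_score(query, chunk):
    # Toy relevance signal: how many query words appear in the chunk.
    q = set(re.findall(r"\w+", query.lower()))
    return sum(1 for w in re.findall(r"\w+", chunk.lower()) if w in q)

chunks = [
    "Cloud computing offers scalability advantages.",
    "The history of mainframes is long.",
    "Cloud services reduce cost through pay-as-you-go computing.",
]
# Keep the 2 chunks most relevant to the query; the mainframe chunk is dropped.
best = rerank_top_n("benefits of cloud computing", chunks, overlap_score, top_n=2)
```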
- Multi Query: Enable this option to improve response accuracy by generating and collating multiple perspectives, especially for poorly phrased questions. This approach is particularly effective for poorly phrased questions because it increases the chances of capturing the intended meaning from different angles, thereby improving the overall relevance and completeness of the AI’s responses.
For example: Imagine a user asks, “What are the effects of climate change?” The Agent, enabled with Multi Query, can generate responses by considering multiple perspectives—such as environmental, economic, and societal impacts—providing a comprehensive answer that covers various facets of the issue.
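The Multi Query idea — rephrase, retrieve per variant, collate unique results — can be sketched like this. The rephrasing and retrieval functions below are toy stand-ins for an LLM rewriter and a vector store:

```python
def multi_query_retrieve(question, rephrase_fn, retrieve_fn):
    """Retrieve chunks for the question and several rephrasings, deduplicated."""
    variants = [question] + rephrase_fn(question)
    collated = []
    for v in variants:
        for chunk in retrieve_fn(v):
            if chunk not in collated:
                collated.append(chunk)
    return collated

def rephrase(question):
    # Stand-in for an LLM that rewrites the question from different angles.
    return ["environmental " + question, "economic " + question]

corpus = {
    "effects of climate change": ["Sea levels are rising."],
    "environmental effects of climate change": [
        "Sea levels are rising.", "Biodiversity is declining."],
    "economic effects of climate change": ["Crop yields affect food prices."],
}

def retrieve(query):
    return corpus.get(query, [])

answers = multi_query_retrieve("effects of climate change", rephrase, retrieve)
```

Collating across variants surfaces chunks a single phrasing would have missed.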
- Self Query: Enable this option for more accurate and relevant responses using detailed metadata queries. This feature is particularly beneficial because it allows the Agent to tailor responses closely to the specific context and details provided within the query itself, leading to more accurate and useful information retrieval.
For example: If a user asks, “How does quantum computing work?”, the Agent, utilizing Self Query, can refine its response by analyzing specific metadata within the query, such as focusing on explaining quantum computing principles and algorithms without delving into unrelated topics like classical computing, thus ensuring the response is directly tailored to the user’s query.
- Query Planner: Enable this option to break down complex queries and improve cross-document referencing. This feature helps the AI efficiently analyze and retrieve information from multiple sources or documents, ensuring comprehensive and accurate responses to intricate queries.
For example: When a user asks, “Compare the economic impacts of renewable energy versus fossil fuels,” Query Planner helps the AI to systematically break down this complex query. It can analyze economic data from various documents, comparing cost structures, environmental impacts, and market trends across renewable energy and fossil fuel sectors to provide a well-rounded and informative response.
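A query planner of this kind can be sketched as decompose-then-answer. The decomposition rule and fact table below are toy assumptions standing in for an LLM planner and real document retrieval:

```python
def plan_and_answer(query, decompose_fn, answer_fn):
    """Split a complex query into sub-queries and answer each one."""
    sub_queries = decompose_fn(query)
    return {sq: answer_fn(sq) for sq in sub_queries}

def decompose(query):
    # Toy planner: split a comparison question into its two sides.
    if "versus" in query:
        left, right = query.split("versus")
        return [left.strip(), right.strip()]
    return [query]

facts = {
    "economic impacts of renewable energy": "Falling costs, new jobs.",
    "fossil fuels": "Volatile prices, subsidies.",
}

report = plan_and_answer(
    "economic impacts of renewable energy versus fossil fuels",
    decompose,
    lambda sq: facts.get(sq, "no data"),
)
```

Each side of the comparison is answered separately, then the parts can be combined into one response.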
- Smart Retrieval: Enable this option to allow the AI Agent to smartly choose the best response approach for each query, using the Knowledgebase or its existing knowledge as needed.
For example: Imagine a user asks, “How does blockchain technology impact supply chain management?” With Smart Retrieval enabled, the AI can select the best approach to answer this query. It might retrieve information from industry reports, case studies, and academic papers to provide a comprehensive analysis of blockchain’s effects on supply chain transparency, efficiency, and security. This ensures the response is well-informed and tailored to the specific aspects of the user’s question.
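One way to picture this routing is a relevance-threshold check: answer from the Knowledgebase when a sufficiently relevant chunk exists, otherwise fall back to the model’s own knowledge. The function names and scores here are hypothetical, not the product’s actual routing logic:

```python
def smart_retrieve(query, kb_search_fn, threshold=0.5):
    """Route to the Knowledgebase if it has a relevant enough hit."""
    hits = kb_search_fn(query)  # list of (chunk, relevance_score) pairs
    if hits and hits[0][1] >= threshold:
        return "knowledgebase", [c for c, score in hits if score >= threshold]
    return "model_knowledge", []

def kb_search(query):
    # Toy Knowledgebase lookup returning scored chunks.
    if "blockchain" in query.lower():
        return [("Blockchain improves supply chain transparency.", 0.9)]
    return []

source, found = smart_retrieve("How does blockchain impact supply chains?", kb_search)
```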
On-demand File
- RAG vs No RAG for On-Demand: Enable this option to apply RAG (Retrieval-Augmented Generation) to on-demand files, allowing the model to incorporate knowledge bases linked to those files when generating more informed responses. This feature enhances response accuracy and depth by leveraging external sources of information directly relevant to user queries.
- Chunking Strategy (Applicable for RAG): Choose any one of the following for the on-demand files.
- Block: Choose this option if you wish to chunk the documents by blocks. Suitable for documents with diverse sections or topics where each block may represent a distinct segment requiring individual processing.
- Page: Choose this option if you wish to chunk the documents by pages. Appropriate for documents with consistent and uniform content, where dividing by page ensures even distribution and manageable sections.
- Word: Choose this option if you wish to chunk the documents by words. Beneficial for content where word-level context is paramount, ensuring that the model processes a specific number of words for accuracy and coherence.
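The three strategies can be sketched with simple text splitting; `chunk_document` is a hypothetical illustration (real page boundaries come from the file format, approximated here by form-feed characters):

```python
def chunk_document(text, strategy="word", words_per_chunk=100):
    """Split a document into chunks by block, page, or fixed word count."""
    if strategy == "block":
        # Blocks: split on blank lines, one chunk per paragraph/section.
        return [b.strip() for b in text.split("\n\n") if b.strip()]
    if strategy == "page":
        # Pages: split on form feeds, a common page-break marker.
        return [p.strip() for p in text.split("\f") if p.strip()]
    if strategy == "word":
        # Words: fixed-size windows of words for even, context-sized chunks.
        words = text.split()
        return [" ".join(words[i:i + words_per_chunk])
                for i in range(0, len(words), words_per_chunk)]
    raise ValueError(f"unknown strategy: {strategy}")
```

For instance, a report with distinct sections chunks naturally by block, while a uniform manual may split more evenly by page or word count.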