Businesses face the challenge of efficiently extracting quick answers from lengthy documents. The GenAI-powered Purple Fabric platform addresses this by letting you hold a conversation with your documents and get on-demand information from documents of nearly any type or format. It enhances efficiency by promptly addressing a wide range of queries, freeing employees to focus on complex tasks, and it ensures consistency in responses across all customer interactions. Its scalability also allows it to handle growing inquiry volumes without compromising quality.
Users must have the Gen AI User policy to access the question and answer capability.
This guide walks you through the steps to get answers to questions from documents with the help of Purple Fabric.
- Create an asset
- Select a prompt template
- Select a model and set model configurations
- Provide the system instructions
- Run the model and view results
- Validate and benchmark the asset
- Publish the asset
- Consume the asset
Step 1: Create an asset
- Head to the Gen AI Studio module and click Create Asset.
- In the Create Gen AI asset window that appears, enter a unique Asset name, for example, “Budget_Facts_Finder” to easily identify it within the platform.
- Optional: Enter a brief description and upload an image to provide additional context or information about your Asset.
- In Type, choose Conversational Agent and click Create.
Step 2: Select a prompt template
- On the Gen AI Asset creation page that appears, choose Default Prompt template.
Step 3: Select a model and set model configurations
Select a Model
- Select a model from the available list, considering model size, capability, and performance. Refer to the table below to choose the appropriate model for your requirements (a short sketch for estimating token counts follows the table).
| LLM Model | Model Input (as configured on the platform) | Model Output | Input Context Window (Tokens) | Output Generation Size (Tokens) | Capability and Suitable For |
| --- | --- | --- | --- | --- | --- |
| Azure OpenAI GPT 3.5 Turbo 4K | Text | Text | 4,096 | 4,096 | Ideal for applications requiring efficient chat responses, code generation, and traditional text completion tasks. |
| Azure OpenAI GPT 3.5 Turbo 16K | Text | Text | 16,384 | 4,096 | Ideal for applications requiring efficient chat responses, code generation, and traditional text completion tasks. |
| Azure OpenAI GPT-4o | Text | Text | 128,000 | 16,384 | GPT-4o demonstrates strong performance on text-based tasks such as knowledge-based Q&A, text summarization, and language generation in over 50 languages. Also useful in complex problem-solving scenarios, advanced reasoning, and generating detailed outputs. Recommended for ReAct. |
| Azure OpenAI GPT-4o mini | Text | Text | 128,000 | 16,384 | Similar to GPT-4o but with lower cost and slightly lower accuracy. Recommended for ReAct. |
| Bedrock Claude 3 Haiku 200K | Text + Image | Text | 200,000 | 4,096 | The Anthropic Claude 3 Haiku model is a fast and compact version of the Claude 3 family of large language models. Claude 3 Haiku demonstrates strong multimodal capabilities, adeptly processing diverse types of data, including text in multiple languages and various visual formats. Its expanded language support and sophisticated vision analysis skills enhance its versatility and problem-solving abilities across a wide range of applications. |
| Bedrock Claude 3 Sonnet 200K | Text + Image | Text | 200,000 | 4,096 | Comparatively more performant than Haiku, Claude 3 Sonnet combines robust language processing capabilities with advanced visual analysis features. Its strengths in multilingual understanding, reasoning, coding proficiency, and image interpretation make it a versatile tool for various applications across industries. |
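The context window and output limits above are measured in tokens, not characters. As a rough, illustrative check outside the Purple Fabric UI, you can estimate whether a document fits a model's input window with a tokenizer such as tiktoken; the encoding name and placeholder text below are assumptions that match OpenAI GPT-style models, not platform settings.

```python
# Illustrative only: estimate whether text fits a model's context window.
# Assumes the tiktoken library and an OpenAI-style encoding; Claude models use
# a different tokenizer, so treat the count as a rough guide there.
import tiktoken

def fits_context_window(text: str, context_window: int, reserved_for_output: int = 1024) -> bool:
    """Return True if the text, plus room reserved for the answer, fits in the window."""
    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-3.5/GPT-4-class models
    return len(enc.encode(text)) + reserved_for_output <= context_window

document_text = "Replace this with the text extracted from your budget document."  # placeholder
print(fits_context_window(document_text, context_window=4_096))    # e.g. GPT 3.5 Turbo 4K
print(fits_context_window(document_text, context_window=128_000))  # e.g. GPT-4o
```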
Set Model Configuration
- Click the configuration icon, then set the following tuning parameters to optimize the model’s performance. For more information, see Advanced Configuration.
Step 4: Provide the system instructions
A system instruction refers to a command or directive provided to the model to modify its behavior or output in a specific way. For example, a system instruction might instruct the model to summarize a given text, answer a question in a specific format, or generate content with a particular tone or style.
- In the System Instructions section, enter the system instructions by crafting a prompt that guides the agent in answering questions from the uploaded documents.
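For example, a system instruction for this question-and-answer asset might look like the following; the exact wording is illustrative only and should be adapted to your documents and tone:

“You are an assistant that answers questions strictly from the content of the uploaded documents. Answer concisely, point to the relevant section or page where possible, and if the answer is not present in the documents, say so rather than guessing.”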
Step 5: Run the model and view results
- In the Debug and preview section, click the Knowledge files icon.
- In the Knowledge files window that appears, upload the documents that you wish to include and seek answers from.
- In the query bar, enter the prompt to seek answers from the document uploaded.
- Click the send icon or press the Enter key to run the prompt.
- The model returns a response based on your query.
- Review the generated response to ensure it adequately addresses or clarifies your query.
- Click Reference if you wish to view the reference of the output.
- Select the respective field information to view its reference.
- If necessary, provide examples to enhance the conversational agent’s understanding and response accuracy for answering questions.
Note: If the answer falls short of your expectations, provide additional context or rephrase your prompt for clarity. You can also try switching to a different model.
Step 6: Validate and benchmark the asset
Benchmarking allows you to compare the performance of different models based on predefined metrics to determine the most effective one for your needs.
- In the Debug and preview section, click Benchmark.
- In the Benchmark window that appears, click Start New to begin setting up the benchmarks.
Add Input and Expected Output
- On the Benchmark page, click the add icon.
- In the Input and Expected output fields that appear, enter the example input/prompt and the expected output (see the sample pair after these steps).
- Click the add icon to add another model to benchmark the response against.
- Click Re-run prompt.
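For illustration, a benchmark pair for the Budget_Facts_Finder asset might look like the following; the question is hypothetical, and the expected output should be the answer exactly as it appears in your uploaded document:

- Input: “Which sector received the largest allocation in the budget?”
- Expected output: the one-sentence answer as stated in the uploaded budget document, with no figures that do not appear in the document.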
Add additional Benchmark
- On the Benchmark page, click the add icon to add additional models to benchmark against.
- In the Benchmark window that appears, click Model and Prompt Settings.
- In the Model and Prompt Settings window, choose another model for comparison and then click Save.
- Click the configuration icon and adjust the metrics to optimize the model’s performance. For more information, see Advanced Configuration.
- Click Save to add the model to the Benchmark.
Validate
- On the Benchmark page, click Re-run prompt to generate responses from the models.
- You can view the responses in the Benchmark model section.
- Compare the responses of the models based on tokens, score, latency, and cost to determine the best-suited model for your use case.
- Preview, like or dislike the results to share feedback with fellow team members.
Definition of Metrics
- On the Benchmark page, click Metrics to define and adjust the metrics settings.
- In the Metrics window that appears, choose the metrics that you wish to compare across the models.
- Cost: Cost refers to the financial expenditure associated with using the language model. Costs vary based on the number of tokens processed, the level of accuracy required, and the computational resources utilized.
- Latency: Latency refers to the time delay between a user’s input and the model’s output. Latency can be influenced by factors such as the complexity of the task, the model’s size, and the computational resources available. Lower latency indicates faster response times.
- Tokens: Tokens are the units of text that the model processes; “tokens used” refers to the number of tokens processed to generate a response. Each token consumes computational resources and may be subject to pricing.
- Rouge-L: Rouge-L calculates the longest common subsequence between the generated text and the reference text. It evaluates the quality of the generated text based on the longest sequence of words that appear in both texts, regardless of their order (a small computational sketch of this metric follows this list).
- Answer Similarity: Answer Similarity measures how similar the generated answers are to the reference answers. It can be computed using various similarity metrics such as cosine similarity, Jaccard similarity, or edit distance.
- Accuracy: Accuracy measures the correctness of the generated text in terms of grammar, syntax, and semantics. It evaluates whether the generated text conveys the intended meaning accurately and fluently, without errors or inconsistencies.
- You can view the selected metrics against the models.
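To make the Rouge-L and Answer Similarity definitions above concrete, here is a minimal, self-contained sketch of how such scores can be computed. It only illustrates the definitions; it is not Purple Fabric's implementation, and the example strings are placeholders.

```python
# Illustrative computation of two benchmark-style metrics.
# This mirrors the definitions above; it is not Purple Fabric's implementation.
from collections import Counter
import math

def rouge_l(generated: str, reference: str) -> float:
    """Rouge-L F-measure based on the longest common subsequence of word tokens."""
    g, r = generated.split(), reference.split()
    # Dynamic-programming table for the LCS length.
    dp = [[0] * (len(r) + 1) for _ in range(len(g) + 1)]
    for i in range(1, len(g) + 1):
        for j in range(1, len(r) + 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if g[i - 1] == r[j - 1] else max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(g), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

def answer_similarity(generated: str, reference: str) -> float:
    """Cosine similarity over simple word-count vectors (one of several possible measures)."""
    a, b = Counter(generated.lower().split()), Counter(reference.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

print(rouge_l("the budget allocates more funds to health", "the budget allocates additional funds to health"))
print(answer_similarity("the budget allocates more funds to health", "health receives more funds in the budget"))
```

Answer Similarity is shown here with a simple word-count cosine; embedding-based or edit-distance measures are equally valid choices, as noted in the metric definition above.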
Step 7: Publish the asset
- If the desired accuracy and performance for getting answers from the document have been achieved, click Publish.
- In the Asset Details page that appears, enter the Welcome Message, Conversation Starters and Asset Disclaimer.
- Optional: Upload a sample image for a visual representation.
- Click Publish and the status of the Asset changes to Published. It can be accessed in the Gen AI Studio.
Step 8: Consume the asset
- Head to the Gen AI Studio module. Use the Search bar to find an Asset.
- Select an Asset that you wish to consume.
- In the Conversational Assistant that appears, initiate a conversation by asking the asset a question based on the uploaded document. An example could be “What are the key takeaways from the budget?”