An enterprise platform powered by GenAI with automated extraction capabilities streamlines document handling. Using Artificial Intelligence, the platform automatically identifies and extracts relevant data from various sources, reducing manual effort and minimizing errors. This enables business users to perform extraction easily and efficiently, saving time, improving data accuracy, and boosting overall productivity.
Users must have the Gen AI User policy to access the extraction capability.
This guide walks you through the steps to create an Extraction Agent:
- Create an asset
- Select a prompt template
- Select a model and set model configurations
- Provide the system instructions, parameters, output schema and examples
- Run the model and view results
- Publish the asset
Step 1: Create an asset
- Head to the Gen AI Studio module and click Create Asset.
- In the Create Gen AI asset window that appears, enter a unique Asset name, for example, “NACH_Mandate_Extractor”, to easily identify it within the platform.
- Optional: Enter a brief description and upload an image to provide additional context or information about your Asset.
- In Type, choose Automation Agent and click Create.
Step 2: Select a prompt template
- On the Gen AI Asset creation page that appears, choose Default Prompt template.
Step 3: Select a model and set model configurations
Select a Model
- Select a model from the available list, considering model size, capability, and performance. Refer to the table below to choose the appropriate model for your requirements.
| LLM Model | Model Input (as per platform configuration) | Model Output | Input Context Window (Tokens) | Output Generation Size (Tokens) | Capability and Suitable For |
| --- | --- | --- | --- | --- | --- |
| Azure OpenAI GPT 3.5 Turbo 4K | Text | Text | 4,096 | 4,096 | Ideal for applications requiring efficient chat responses, code generation, and traditional text completion tasks. |
| Azure OpenAI GPT 3.5 Turbo 16K | Text | Text | 16,384 | 4,096 | Ideal for applications requiring efficient chat responses, code generation, and traditional text completion tasks. |
| Azure OpenAI GPT-4o | Text | Text | 128,000 | 16,384 | Demonstrates strong performance on text-based tasks such as knowledge-based Q&A, text summarization, and language generation in over 50 languages. Also useful in complex problem-solving scenarios, advanced reasoning, and generating detailed outputs. Recommended for ReAct. |
| Azure OpenAI GPT-4o mini | Text | Text | 128,000 | 16,384 | Similar to GPT-4o but at lower cost, with slightly lower accuracy. Recommended for ReAct. |
| Bedrock Claude 3 Haiku 200K | Text + Image | Text | 200,000 | 4,096 | A fast and compact member of the Anthropic Claude 3 family. Claude 3 Haiku demonstrates strong multimodal capabilities, adeptly processing diverse data including text in multiple languages and various visual formats. Its expanded language support and sophisticated vision analysis enhance its versatility across a wide range of applications. |
| Bedrock Claude 3 Sonnet 200K | Text + Image | Text | 200,000 | 4,096 | More performant than Haiku, Claude 3 Sonnet combines robust language processing with advanced visual analysis. Its strengths in multilingual understanding, reasoning, coding proficiency, and image interpretation make it a versatile tool across industries. |
Set Model Configuration
- Set the following tuning parameters to optimize the model’s performance. For more information, see Advanced Configuration.
Step 4: Provide the system instructions, parameters, output schema and examples
Provide System Instructions
A system instruction refers to a command or directive provided to the model to modify its behavior or output in a specific way. For example, a system instruction might instruct the model to summarize a given text, answer a question in a specific format, or generate content with a particular tone or style.
- Enter the system instructions by crafting a prompt that guides the agent in extracting the data.
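As an illustration, a system instruction for a NACH mandate extractor might look like the following. The exact wording and the field names (`account_holder`, `bank_name`, `amount`) are hypothetical examples, not values prescribed by the platform:

```python
# Hypothetical system instruction for an extraction agent.
# The field names below are illustrative only; use the fields
# your own use case requires.
SYSTEM_INSTRUCTIONS = """
You are a document extraction assistant. From the attached NACH mandate,
extract the following fields exactly as they appear in the document:
- account_holder: the name of the account holder
- bank_name: the sponsor bank name
- amount: the mandate amount in figures
If a field is not present in the document, return an empty string for it.
Return only the requested fields.
"""
```

Being explicit about the fields, their meaning, and the fallback behavior (empty string when absent) tends to make extraction output more predictable.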
Add Parameters
- In the Parameter section, click Add.
- Enter the following information.
- Name: Enter the Name of the input parameter.
- Type: Choose File as the data type.
- Description: Enter the Description for each of the input parameters. The description of the parameters ensures accurate interpretation and execution of tasks by the Gen AI Asset. Be as specific as possible.
- Open the settings against the input parameter to add the input field settings.
- Choose the required file formats (PDF, JPEG, JPG, TIFF, PNG) from the drop-down menu.
- Select a chunking strategy for file inputs: Page, Words, or Block.
- Click Save to proceed.
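The Page, Words, and Block options determine how a large document is split before being sent to the model. The platform applies its chunking internally; purely as a rough illustration of the idea behind the Words strategy (not the platform's implementation):

```python
def chunk_by_words(text: str, words_per_chunk: int = 200) -> list[str]:
    """Split text into chunks of at most `words_per_chunk` words.

    Illustrative sketch only -- the platform performs its own chunking;
    this just shows what a word-based split means conceptually.
    """
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]

# A 450-word document at 200 words per chunk yields 3 chunks (200 + 200 + 50).
chunks = chunk_by_words("word " * 450, words_per_chunk=200)
```

Smaller chunks fit comfortably in a model's context window; larger chunks preserve more surrounding context for each extraction.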
Define Output Schema
- In the Output section, click Add to define the output schema for the Asset.
- Enter the Variable Name, Type, and Description for each output variable. Supported types are Text, Number, Boolean, and DateTime.
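Conceptually, the output schema is a list of typed variables with descriptions the model uses to locate each value. A sketch for a NACH mandate extractor — the variable names and descriptions are illustrative, not prescribed by the platform:

```python
# Illustrative output schema for a NACH mandate extractor.
# Supported types per this guide: Text, Number, Boolean, DateTime.
output_schema = [
    {"name": "account_holder", "type": "Text",
     "description": "Full name of the account holder as printed on the mandate."},
    {"name": "mandate_amount", "type": "Number",
     "description": "Mandate amount in figures, without currency symbols."},
    {"name": "is_recurring", "type": "Boolean",
     "description": "True if the mandate authorises recurring debits."},
    {"name": "start_date", "type": "DateTime",
     "description": "Date from which the mandate is effective."},
]
```

Precise descriptions matter: the model relies on them to decide which value in the document maps to which variable.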
Provide Examples
Examples relevant to the extraction task enhance the agent’s understanding and response accuracy. These examples help the agent learn and improve over time.
- In the Examples section, click Add.
- Provide the Context and Answer in the example section.
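An example pairs a sample input (Context) with the output you expect (Answer). A sketch with made-up values, assuming the illustrative schema fields named earlier in this guide:

```python
# Illustrative few-shot example: the context and all values are fabricated.
example = {
    "context": ("NACH mandate: I, Jane Doe, authorise Acme Bank to debit "
                "INR 5,000 monthly starting 01/04/2024."),
    "answer": {
        "account_holder": "Jane Doe",
        "mandate_amount": 5000,
        "is_recurring": True,
        "start_date": "2024-04-01",
    },
}
```

Keeping the answer's keys aligned with the output schema gives the agent a concrete template to imitate.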
Step 5: Run the model and view results
- In the Debug and preview section, browse and add the required document.
- Click Run to get the results for extraction in the required format.
- Review the generated output and verify that each extracted value matches the source document. For Boolean output variables, true indicates the condition is met and false indicates it is not.
- Click Reference to view additional information or context about the classification results, such as the source data, detailed explanations, and relevant metadata.
- Select the respective References to view its information.
Note: If you are not satisfied with the results, try modifying the System Instructions and the descriptions of the output variables. You can also try a different model.
View Trace
- If you wish to view the traces of the prompt and the result, click View trace.
- In the Trace window that appears, review the trace.
Step 6: Publish the asset
- Click Publish once the Asset achieves the desired extraction accuracy and performance.
- In the Asset Details page that appears, write a description and upload an image for a visual representation.
- Click Publish. The status of the Asset changes to Published, and it can then be accessed in the Gen AI Studio.
Note: Once the Asset is published, you can download the API and its documentation. The API can be consumed independently or used within a specific Use case. If you wish to consume this Asset via API, see Consume an Asset via API.
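When consuming a published Asset via API, the document is typically sent as the file parameter you defined in Step 4. The payload shape below is a hypothetical sketch — the endpoint, field names, and encoding are assumptions; the real contract is in the API documentation you download after publishing:

```python
import base64

def build_extraction_request(file_path: str, parameter_name: str = "document") -> dict:
    """Build a request payload for a published extraction Asset.

    The payload shape and field names here are hypothetical -- consult
    the Asset's downloaded API documentation for the actual contract.
    """
    with open(file_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {"parameters": {parameter_name: encoded}}
```

The `parameter_name` should match the input parameter name you configured in the Parameters section of the Asset.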
You can also consume this automation Asset in the Asset Monitor module. For more information, see Consume an Asset via Create Transaction.