Fine-tuning is the process of adjusting and optimising a trained Asset before it is published. This involves further training an Asset using an additional document set to boost its accuracy and confidence score.
To decide whether an Asset needs fine-tuning, open the Accuracy Results page once Asset training is complete. The page gives an overview of both correctly and incorrectly predicted document types, which helps you identify patterns or areas where fine-tuning could improve the Asset’s performance.
Note: Fine-tuning applies only to trained Assets that have not yet been published. It does not apply to published Assets.
Users must have any one of the following policies to fine-tune a Classifier Asset:
- Administrator Policy
- Creator Policy
This guide walks you through the steps to fine-tune a Classifier Asset.
Step 1: Consider scenarios for fine-tuning
The decision to fine-tune an Asset depends on your objectives, which often relate to specific Document types.
- Go to the Asset Studio page and select the trained Asset that you wish to fine-tune.
- On the Accuracy Results page that appears, check the Asset’s overall accuracy rate, Document type level accuracy, and confidence score.
Things to know
Document Type: In a Classifier Asset, a document type is the category or class that a document belongs to. For example, the document types could be “Invoice,” “Purchase Order,” “Receipt,” “Contract,” and more. Each of these represents a distinct category of documents.
Document Variation: A document variation is one of the different forms that documents of the same document type can take. For example, invoices could have different layouts, formats, or styles depending on factors like the vendor, company, or industry standards.
Overall Accuracy: The overall accuracy is the percentage of correct predictions made by the Asset across all Document types.
Document Type Accuracy: The Document type accuracy is the percentage of correct predictions made by the Asset for an individual Document type or category in the test document set.
Confidence Score: The confidence score is a measure of how confident the Asset is in its predictions for different Document types.
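The two accuracy figures defined above can be illustrated with a small calculation. The following is a minimal sketch, not the platform's implementation: the function name accuracy_report and the input format (a list of actual and predicted Document type pairs) are assumptions made for this example. The confidence score is reported by the Asset itself, so it is not recomputed here.

```python
from collections import defaultdict


def accuracy_report(results):
    """Compute overall and per-Document-type accuracy as percentages.

    `results` is a hypothetical list of (actual_type, predicted_type)
    pairs, one pair per document in the test document set.
    """
    type_total = defaultdict(int)    # documents seen per actual type
    type_correct = defaultdict(int)  # correct predictions per actual type

    for actual, predicted in results:
        type_total[actual] += 1
        if actual == predicted:
            type_correct[actual] += 1

    total = sum(type_total.values())
    correct = sum(type_correct.values())

    # Overall accuracy: correct predictions across all Document types.
    overall = 100.0 * correct / total if total else 0.0
    # Document type accuracy: correct predictions within each type.
    by_type = {t: 100.0 * type_correct[t] / n for t, n in type_total.items()}
    return overall, by_type


if __name__ == "__main__":
    sample = [
        ("Invoice", "Invoice"),
        ("Invoice", "Receipt"),   # an Invoice misclassified as a Receipt
        ("Receipt", "Receipt"),
        ("Contract", "Contract"),
    ]
    overall, by_type = accuracy_report(sample)
    print(overall)   # 75.0
    print(by_type)   # {'Invoice': 50.0, 'Receipt': 100.0, 'Contract': 100.0}
```

In this sample the overall accuracy looks acceptable at 75%, while the Invoice type sits at 50%. This is why the per-type figure is worth checking separately.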
- Consider fine-tuning the Asset in the following scenarios:
- To improve the overall accuracy of the Asset: Fine-tune when the Asset’s overall accuracy is low.
- To improve the Document type level accuracy: Fine-tune when the accuracy for a certain Document type is low.
- To improve the accuracy for specific document variations: Fine-tune when specific document variations show low accuracy. For example, if you’re creating a Classifier Asset to classify invoices and purchase orders, and you notice low accuracy or confidence scores for invoices from specific vendors or purchase orders in certain formats, then you can initiate fine-tuning.
- To improve the confidence score: Fine-tune when the confidence score for certain Document types or document variations is low.
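The scenarios above can be expressed as a simple check. This is only an illustrative sketch: the guide does not define what counts as "low", so the thresholds below (90% accuracy, 0.8 confidence), the 0 to 1 confidence scale, and the function name fine_tune_reasons are all assumptions. Choose values that match your own objectives.

```python
def fine_tune_reasons(overall_accuracy, type_accuracy, type_confidence,
                      min_accuracy=90.0, min_confidence=0.8):
    """Return the reasons, if any, to consider fine-tuning.

    overall_accuracy: overall accuracy as a percentage.
    type_accuracy:    dict of Document type -> accuracy percentage.
    type_confidence:  dict of Document type -> confidence (assumed 0 to 1).
    The default thresholds are hypothetical, not product values.
    """
    reasons = []
    if overall_accuracy < min_accuracy:
        reasons.append("overall accuracy is low")
    for doc_type, accuracy in sorted(type_accuracy.items()):
        if accuracy < min_accuracy:
            reasons.append(f"accuracy for {doc_type} is low")
    for doc_type, confidence in sorted(type_confidence.items()):
        if confidence < min_confidence:
            reasons.append(f"confidence for {doc_type} is low")
    return reasons


if __name__ == "__main__":
    reasons = fine_tune_reasons(
        overall_accuracy=95.0,
        type_accuracy={"Invoice": 80.0, "Receipt": 98.0},
        type_confidence={"Invoice": 0.9, "Receipt": 0.6},
    )
    for reason in reasons:
        print(reason)
```

An empty list means none of the assumed thresholds were missed. The same check can be applied to document variations by passing variation names (for example, a vendor-specific invoice layout) as the dictionary keys.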
Step 2: Upload documents
After identifying the areas for improvement in the Asset, make sure the document sets required for fine-tuning are available. If you have already uploaded the documents to the Document Library, skip this step and proceed to Step 3.
Otherwise, upload the required documents to the Document Library. For more information about uploading documents, see Upload Documents.
Step 3: Initiate fine-tune
Note: Fine-tuning can also reduce the accuracy of the Asset if it is not performed with an appropriate document set and annotations.
- On the Accuracy Results page, click Fine-tune.
- In the Proceed to fine-tune window that appears, click Proceed.
Step 4: Select documents
- In the Document Sets pane, select or search for the document set.
- In the right pane, select the documents required to fine-tune the Classifier Asset.
Note: Select at least 10 documents to proceed with fine-tuning. However, we recommend selecting 25 documents or more for higher accuracy.
- Click Proceed to annotate the documents.
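The document-count guidance in the note above can be summarised as a quick pre-check before you proceed. This is a minimal sketch for illustration only. The limits of 10 and 25 come from this guide, while the function name check_selection and the returned messages are hypothetical and not part of the product.

```python
MIN_DOCUMENTS = 10          # minimum stated in this guide
RECOMMENDED_DOCUMENTS = 25  # recommended volume stated in this guide


def check_selection(selected_count):
    """Classify a document selection against the guide's limits."""
    if selected_count < MIN_DOCUMENTS:
        return "too few: select at least 10 documents"
    if selected_count < RECOMMENDED_DOCUMENTS:
        return "allowed: 25 or more documents are recommended"
    return "ok"


if __name__ == "__main__":
    for count in (8, 12, 30):
        print(count, "->", check_selection(count))
```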
Step 5: Annotate and train
Annotation refers to the process of labelling documents against the Document types defined as part of the fine-tuning process.
For more information about annotating a Classifier Asset, see Annotate and Train.
Step 6: Review results and validate
In this step, you assess the Asset’s predictions, accuracy, and confidence score. You can also use the Validate feature to evaluate the Classifier Asset’s performance on a new set of documents.
For more information on reviewing the results and validation, see Review results and validate.
Step 7: Publish the asset
Once the Asset reaches the desired accuracy, you can publish it. For more information on publishing the Asset, see Publish the asset.