# Training Material

### What is Nanonets?

Nanonets is a document data extraction platform. Customers typically use Nanonets in one of two ways: they either use our pre-trained models or build their own custom OCR model to automate manual data entry. We extract data in a structured key-value pair format that you can consume directly.

{% embed url="<https://www.youtube.com/watch?v=-xlaRA7HYNQ>" %}

### How do customers typically use Nanonets?

Customers use Nanonets to automate manual data entry for any document type.

Input = Upload an image or PDF file via our UI or API

Output = Structured response, consumed via our API or downloaded as .csv/.xlsx through our UI

![](https://s3.amazonaws.com/helpjuice-static/helpjuice_production%2Fuploads%2Fupload%2Fimage%2F6866%2Fdirect%2F1606745210064-1606745210064.png)

Nanonets sample API response structure
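Since the output is described as structured key-value pairs, the sketch below shows one way a client might flatten such a response into one row per file and serialize it as CSV. The response shape and the field names `result`, `input`, `prediction`, `label`, and `ocr_text` are assumptions made for illustration; check them against the actual API response before relying on them.

```python
import csv
import io

# Hypothetical response shape, for illustration only. The field names
# here are assumptions, not a specification of the real API.
sample_response = {
    "result": [
        {
            "input": "invoice_001.pdf",
            "prediction": [
                {"label": "invoice_number", "ocr_text": "INV-1042"},
                {"label": "total_amount", "ocr_text": "249.00"},
            ],
        }
    ]
}


def to_key_values(response):
    """Flatten each file's predictions into one {label: text} dict."""
    rows = []
    for file_result in response.get("result", []):
        row = {"file": file_result.get("input", "")}
        for field in file_result.get("prediction", []):
            row[field["label"]] = field["ocr_text"]
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize flattened rows to CSV text, one row per file."""
    # Collect column names in first-seen order across all rows.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_key_values(sample_response)
print(rows[0]["invoice_number"])
print(to_csv(rows))
```

Keeping the flattening step separate from the CSV step means the same rows can be passed to a database insert or another downstream system instead.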

### How does Nanonets differ from Abbyy/Kofax/Docparser?

Our team’s roots are in deep learning applied to computer vision. We don’t learn document templates like Abbyy, Kofax, and Docparser; we learn the document itself. You can get started on Nanonets from day 1 without any template or rules setup. We are also developer-first, with easy-to-integrate REST APIs and accompanying documentation.

### What is a pre-trained model?

We support data extraction from some popular document types out of the box: invoices, receipts, driver’s licenses, and passports, to name a few. For each of these document types we extract a predefined set of fields. You can always add a custom field if you need to extract something else.

### What file formats are supported?

We support all major file and image formats like PDF, JPEG, PNG, TIFF, etc.

### What is a custom OCR model?

Nanonets allows you to train a model to extract specific labels from your document type without writing a single line of code.

### How do you train a custom OCR model?

Watch this 2-minute video:

{% embed url="<https://www.youtube.com/watch?v=LnOMJDtCCNY&feature=emb_logo>" %}

### How many files are required to train an OCR model?

We require a minimum of 50 images to train a custom model. We recommend starting with 50 and adding files depending on the accuracy you see. For a complicated document type you might need 1000 or more files.

### How long does it take to train a custom OCR model?

Training usually takes between 20 minutes and 2 hours, depending on the number of files and the number of models queued for training. If training is taking longer, you can upgrade your model to a paid plan to move to the front of the queue and get more compute resources allocated.

### How do I correct API predictions before consuming the response?

You can use our “Verification” feature. Watch this 3-minute video to learn more:

{% embed url="<https://www.youtube.com/watch?v=TDv71M_iNiQ>" %}
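As a complement to manual verification, a client could pre-select which extracted fields deserve a closer look before the response is consumed. The sketch below is a client-side illustration only, not a description of how the Verification feature works internally. It assumes each prediction carries a confidence value under a hypothetical `score` key, and the `0.8` threshold is an arbitrary example value.

```python
# Hypothetical cutoff; tune it against your own documents.
REVIEW_THRESHOLD = 0.8


def fields_needing_review(predictions, threshold=REVIEW_THRESHOLD):
    """Return labels whose confidence is below the threshold.

    A prediction with no `score` key is treated as score 0.0, so it is
    always flagged for review.
    """
    return [
        p["label"]
        for p in predictions
        if p.get("score", 0.0) < threshold
    ]


# Illustrative predictions; the keys are assumptions, not the real schema.
predictions = [
    {"label": "invoice_number", "ocr_text": "INV-1042", "score": 0.97},
    {"label": "total_amount", "ocr_text": "249.00", "score": 0.62},
    {"label": "due_date", "ocr_text": "2021-01-15"},
]

print(fields_needing_review(predictions))  # ['total_amount', 'due_date']
```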

### Can I run the Nanonets solution on-premises?

Yes, we support on-premises deployments. You can learn more about it here:

<https://nanonets.com/help/security/do-you-have-an-on-premises-solution>

### Can I capture Table data using Nanonets?

Yes. Nanonets supports table data extraction, and you can specify the columns and rows of interest while training a model. For a more in-depth explanation, watch this 4-minute [video](https://nanonets.com/help/ocr/how-to-annotate-tables-or-line-items-).

![Nanonets sample table capture](https://s3.amazonaws.com/helpjuice-static/helpjuice_production%2Fuploads%2Fupload%2Fimage%2F6866%2Fdirect%2F1606745210183-1606745210183.png)

Watch this 4-minute video:

{% embed url="<https://www.youtube.com/watch?v=-sujCeE8veI>" %}

### What is model re-training?

You can add your own or additional data to improve the accuracy of a pre-trained model. This process involves uploading data, labelling the documents correctly, and training the model.

### Other helpful links

[What are the best practices to train high accuracy custom OCR models?](https://nanonets.com/help/ocr/best-practices-for-high-accuracy-models)

[How to finetune the invoice/receipts model and add my own fields?](https://nanonets.com/help/ocr/train-your-own-invoice-model)

[How do I run an OCR model on-premises via docker?](https://nanonets.com/help/ocr/what-are-the-instructions-to-run-a-nanonets-docker-image)

[How to handle and correct missed/wrong predictions?](https://nanonets.com/help/ocr/how-to-fix-predictions-on-an-ocr-model)

[How to annotate tables or line items?](https://nanonets.com/help/ocr/how-to-annotate-tables-or-line-items-)


