Features

Allows users to: 

  • Quickly split files into test and training sets
  • Maintain an existing folder structure
  • Define a custom split between a test and training set (e.g. 80% of all documents should go to training)
  • Use custom seeds to reproduce splits, or generate entirely new ones
  • Perform (manual) cross-folder validation with any Transformation product
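
The custom split and seed features can be illustrated with a short sketch. This is not DocumentSetBuilder's actual implementation (the tool itself runs on .NET); it is a minimal Python illustration of how a seeded percentage split typically works. The function name `split_files` and its parameters are hypothetical.

```python
import random


def split_files(paths, train_fraction=0.8, seed=None):
    """Partition paths into (training, test) lists.

    Hypothetical helper for illustration only; not part of DocumentSetBuilder.
    """
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    # Sort first so the result depends only on the seed, not on input order.
    shuffled = sorted(paths)
    # A private Random instance keeps the global random state untouched.
    # seed=None draws a fresh, non-reproducible split.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `split_files(paths, 0.8, seed=42)` twice returns the same two lists, while changing the seed, or passing `seed=None`, produces a different split.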

Benefits

Never test on your training set! DocumentSetBuilder helps you split thousands of files into two sets in seconds.


Inputs

A folder with as many subdirectories as required, containing any kind of file (usually images or PDFs along with XDocuments).


Outputs

Two folders (one training set, one test set) that share the folder structure of the input, with the files divided between them according to the requested percentage.
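
To make the input and output layout concrete, the following Python sketch copies a folder tree into two output folders while keeping each file's relative path. It is an illustration under stated assumptions, not the tool's own code: the function name `build_document_sets` and all of its parameters are hypothetical, and every file is treated independently. Whether DocumentSetBuilder keeps an image together with its XDocument is not stated here, so the sketch makes no attempt to do so.

```python
import random
import shutil
from pathlib import Path


def build_document_sets(source, train_dir, test_dir, train_fraction=0.8, seed=None):
    """Copy files under source into train_dir and test_dir, mirroring subfolders.

    Hypothetical helper for illustration only. Returns (train_count, test_count).
    """
    source = Path(source)
    # Collect every file in the tree; sorting makes the split depend only on the seed.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    random.Random(seed).shuffle(files)
    cut = round(len(files) * train_fraction)
    for index, path in enumerate(files):
        target_root = Path(train_dir) if index < cut else Path(test_dir)
        # Re-create the file's relative path below the chosen output folder.
        target = target_root / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
    return cut, len(files) - cut
```

For example, `build_document_sets("input", "training", "test", 0.8, seed=1)` would place a file found at `input/invoices/a.pdf` at either `training/invoices/a.pdf` or `test/invoices/a.pdf`.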


Required Software / Applications

.NET Framework 4.5

Language Availability

English