Frequently Asked Questions
What does Synthetic Images offer?
We provide training data services and tools that simplify the development and accelerate the deployment of deep learning systems. The scope of our deliverables depends on the customers' requirements and objectives. A distinct component of our offering is synthetic training data.
What are synthetic images?
We define synthetic images as pictures generated using computer graphics, simulation and machine learning methods to represent 2D or 3D scenes with high photorealism.
What are synthetic image datasets used for?
1. Synthetic image and label datasets are primarily used to train deep learning computer vision models.
2. They can be used to validate a machine vision algorithm when exhaustive validation data is not available.
3. They also serve as a form of digital twin at the component level, which helps archive know-how.
How do you create synthetic images?
1. We use a mix of computer-generated imagery (CGI), visual effects (VFX) and machine learning methods to create a diverse set of unique images.
2. Required inputs include a description of the use case from the domain expert, 3D files (if available) and a handful of reference images of the relevant objects and the scene.
3. The basis for image creation is a 3D file of the object of interest.
4. After digitally recreating the relevant scene, we ensure diversity in the dataset by varying each scene variable across the full range of its plausible values.
5. To avoid overfitting, we generate a few images with random variables, e.g. distractors.
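Steps 4 and 5 can be illustrated with a minimal sketch. It assumes each scene variable is sampled uniformly within a plausible range and that a small share of images receives random distractor objects. The variable names, ranges, and the `distractor_prob` parameter are hypothetical and chosen only for illustration; they do not describe our production pipeline.

```python
import random

# Hypothetical scene variables with illustrative plausible ranges.
SCENE_RANGES = {
    "camera_distance_m": (0.3, 1.5),
    "light_intensity": (0.2, 1.0),
    "object_rotation_deg": (0.0, 360.0),
}


def sample_scene(rng, distractor_prob=0.1, max_distractors=3):
    """Draw one scene configuration, uniformly within each variable's range."""
    params = {
        name: rng.uniform(low, high)
        for name, (low, high) in SCENE_RANGES.items()
    }
    # Only a small share of scenes receives random distractor objects.
    if rng.random() < distractor_prob:
        params["num_distractors"] = rng.randint(1, max_distractors)
    else:
        params["num_distractors"] = 0
    return params


def sample_dataset(n_images, seed=0):
    """Return n_images scene configurations from a seeded generator."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n_images)]
```

Each returned dictionary would then be handed to a renderer. Seeding the generator keeps a dataset reproducible, which helps when a single image needs to be regenerated or inspected.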
Do I have to label synthetic images myself?
No, you can skip this step completely. The synthetic images we deliver come with labels, which are created during the generation process. Customers can choose from a range of label types, depending on the computer vision task at hand. Click here to learn more about the label types we offer.
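To show why labels can come out of the generation process without manual annotation, here is a minimal sketch. It assumes the renderer can output a per-pixel object mask (1 for object pixels, 0 for background), which is an assumption made for this example. The function name `bbox_from_mask` is hypothetical.

```python
def bbox_from_mask(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None if empty.

    `mask` is a list of equally long rows, each a list of 0/1 values.
    """
    # Rows that contain at least one object pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one object pixel.
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (cols[0], rows[0], cols[-1], rows[-1])


# Example: a 4x4 mask with a small object near the center.
example_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
example_bbox = bbox_from_mask(example_mask)  # (1, 1, 2, 2)
```

Because the mask is known exactly at render time, the derived bounding box is not subject to the annotation errors that manual labeling can introduce.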
Why should one consider the use of synthetic data?
1. Save time and get high-quality data on demand: Existing methods to capture and label training datasets are time- and effort-intensive, and subject to human error. Using synthetic datasets, you can save up to 75% of the time while having complete control over the generated data.
2. No more missing data and imbalanced datasets: Certain applications, such as quality inspection, demand a large number of images of NOK (defective) parts to train a robust deep learning model. It is impractical to capture hundreds or thousands of such images from a production facility. With synthetic data, you can generate additional images for the classes that are underrepresented.
3. Explainable AI: At a time when AI explainability is a relevant topic, complete control over training data helps deep learning teams better understand AI model behavior.
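The imbalance point can be made concrete with a small calculation. The sketch below assumes the goal is to bring every class up to the size of the largest class, or to a chosen target. The function name `synthetic_images_needed` and the example counts are hypothetical.

```python
from collections import Counter


def synthetic_images_needed(labels, target_per_class=None):
    """Return how many extra images each class needs to reach the target count.

    If no target is given, the size of the largest class is used.
    """
    counts = Counter(labels)
    if target_per_class is None:
        target_per_class = max(counts.values())
    return {cls: max(0, target_per_class - n) for cls, n in counts.items()}


# Example: 950 captured images of good parts, only 50 of defective parts.
captured_labels = ["OK"] * 950 + ["NOK"] * 50
shortfall = synthetic_images_needed(captured_labels)
# shortfall == {"OK": 0, "NOK": 900}
```

In this example, 900 synthetic NOK images would balance the two classes. Whether full balance is the right target depends on the model and the use case.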
What is unique about data generated by Synthetic Images?
1. The process: Our images are created using state-of-the-art digital tools, where the only inputs required are 3D files, a handful of captured reference images, and a description of the use case and features from the end user.
2. Right balance between quantity and diversity: We generate custom images specific to a use case, taking into consideration all the variables that the scene includes. This ensures context without overfitting the data.
3. Quality of labels and photorealism: The quality of the images we generate has been proven in real industrial projects. Deep learning networks trained using our images have achieved up to 100% precision when validated against captured images from the customer.
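For readers less familiar with the metric, precision is the share of positive predictions that are correct: true positives divided by the sum of true positives and false positives. The sketch below uses the standard definition. The function name and the example counts are illustrative and are not results from a customer project.

```python
def precision(true_positives, false_positives):
    """Return TP / (TP + FP), or None if the model made no positive predictions."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        return None
    return true_positives / predicted_positive


# Illustrative counts only: 45 correct detections and 5 false alarms.
example_precision = precision(45, 5)  # 0.9
```

A precision of 100% means that none of the model's positive predictions on the validation images were false alarms.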
How realistic do synthetic images appear?
Images we generate achieve a high degree of photorealism. It is important to note that artificial neural networks 'perceive and learn' differently than humans. You can see some examples of captured vs synthetic images here.
How long does the dataset delivery take?
Our current delivery time for datasets is 3–5 weeks after receiving specifications. This includes a kickoff meeting, a round of review and approval, and a final meeting after delivery of the data.
Can I create a dataset on my own?
Our self-service software is currently under development; please contact us for more details. In the meantime, we will provide you with the datasets on a project / full-service basis.