1. Overview
DataReadIA is our first product of the “Data Insight Seeking” collection, a complete tool to help the Data Science/Analytics community to make “speaking” its data.
DataReadIA, shortly of “Data-Ready-for-IA”, is a set of applications aiming to get your data ready for any Data Science project. Indeed, your imported raw data will be (click here for detail):
- checked,
- cleaned,
- transformed
- enriched.
Why DataReadIA?
The first and second steps (checking and cleaning) are common tasks you could find in different softwares, some do these tasks correctly, others not well or not better.
DataReadIA takes the checking and cleaning steps to a next level, especially for huge amount of date-time formats and categorical data type, using our advanced format-Inferer agent.
For the transforming and enrichment steps, DataReadIA brings a big difference to your Data team. Based on the question
Who knows better than a Data Scientist to prepare the data for the Modeling task of Data Science/Analytics teams???
DataReadIA algorithms are designed and developed by our Machine-Learning Engineer team, with scientific logics and engineering competence.
2. Application structure
DataReadIA is composed of three components: Algorithms, Studio and APIs, distributed as follows:
Each component is separately Docker-containerized, designed for integration into a Kubernetes cluster.
The Algorithms component is the base layer and abstract to end users.
The Studio and APIs upper layers constantly interact with the Algorithms catalogue in order to bring the chosen algorithm back to the end user.
The user authentication and authorization need to be verified before connection to the Studio and before each call to API via a time-limited token.