Data Management¶
Most ML tools are databases at their core¶
- Weights and Biases is a database of experiments
- Hugging Face is a database of models
- Label Studio is a database of labels
Platforms unifying structured and unstructured data¶
Data Exploration¶
- SQL
- Dataframe
- Pandas
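As a minimal sketch of SQL-based exploration, the snippet below uses Python's built-in `sqlite3` module to summarise a small table. The `examples` table, its columns, and the `label_summary` helper are hypothetical and exist only for illustration. A dataframe library such as Pandas supports the same kind of grouped summary.

```python
import sqlite3

# Hypothetical in-memory table of labelled examples, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE examples (id INTEGER PRIMARY KEY, label TEXT, length INTEGER)"
)
conn.executemany(
    "INSERT INTO examples (label, length) VALUES (?, ?)",
    [("cat", 120), ("dog", 80), ("cat", 100), ("dog", 95), ("bird", 40)],
)


def label_summary(connection):
    """Return (label, count, mean length) rows, most frequent label first."""
    query = """
        SELECT label, COUNT(*) AS n, AVG(length) AS mean_length
        FROM examples
        GROUP BY label
        ORDER BY n DESC, label ASC
    """
    return connection.execute(query).fetchall()


summary = label_summary(conn)
for label, n, mean_length in summary:
    print(f"{label}: n={n}, mean_length={mean_length:.1f}")
```

A grouped count like this is often the first check on a new dataset, since it shows class imbalance before any modelling starts.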
Data Processing¶
Manage task dependencies and run models on a schedule by expressing the data operations as a Directed Acyclic Graph (DAG) workflow
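The DAG idea can be sketched with the standard library alone. This is an illustrative toy, not how any particular workflow tool is implemented: the task names, the `pipeline` mapping, and the `run_pipeline` helper are all assumptions. It relies on `graphlib.TopologicalSorter` (Python 3.9+), which yields each task only after the tasks it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "clean": {"extract"},
    "featurize": {"clean"},
    "train": {"featurize"},
    "report": {"train", "clean"},
}


def run_pipeline(dag, actions):
    """Run every task after all of its dependencies; return the execution order."""
    order = []
    for task in TopologicalSorter(dag).static_order():
        actions[task]()  # a real workflow tool would launch a job here
        order.append(task)
    return order


log = []
# Each placeholder action only records that its task ran.
actions = {name: (lambda name=name: log.append(name)) for name in pipeline}
order = run_pipeline(pipeline, actions)
print(order)
```

If the mapping contained a cycle, `static_order()` would raise `graphlib.CycleError`, which is why the graph has to be acyclic.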
Feature stores¶
Datasets¶
Data Labelling¶
- Self-supervised learning: elements of the data are masked, and the model uses the earlier parts of the data to predict the masked parts
- Image data augmentation
- Synthetic data
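To make the self-supervised item concrete, here is a minimal sketch of how unlabelled text can supply its own labels: each token in turn is treated as the masked part, and the tokens before it form the input. The `next_token_examples` function and the sample sentence are hypothetical, and real pipelines work on tokenised corpora rather than whitespace-split strings.

```python
def next_token_examples(tokens, min_context=1):
    """Build (context, target) pairs; each target is hidden and predicted from what precedes it."""
    return [
        (tokens[:i], tokens[i])
        for i in range(min_context, len(tokens))
    ]


sentence = "the cat sat on the mat".split()
examples = next_token_examples(sentence)
for context, target in examples:
    print(" ".join(context), "->", target)
```

No human annotation is involved: the six-word sentence yields five training pairs on its own.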
Labelling Solutions¶
- Crowdsourced:
- Full service: