Orange Data Mining

Orange Blog

By: Noah Novšak, Oct 24, 2023

Dask all Folks: preparing large datasets

How to prepare large HDF5 datasets that load into Orange as on-disk data. This post outlines the specifics of the HDF5 format used by Orange and provides Python code that will help you prepare your own large datasets.

Categories: dask development

By: Martin Špendl, Oct 20, 2023

Recap of 26th International Conference on Discovery Science

Two of our PhD students presented their work at the 26th International Conference on Discovery Science in Porto.

Categories: conference research

By: Erika Funa, Oct 17, 2023

Fall Season Brings Fresh Content to the Introduction to Data Science Series

Updates to the Introduction to Data Science video series, including a new video on the logistic regression nomogram.

Categories: update video series

By: Žan Mervič, Sep 19, 2023

Why Removing Features Isn't Enough

In this blog, we confront the common misconception that merely removing a protected attribute from a dataset eliminates bias in model predictions. Our case study reveals that models trained without these attributes still produce biased results. This is due to feature correlations that indirectly capture the protected information. Our conclusion? You cannot sidestep the need for specialized fairness algorithms.

Categories: fairness

By: Žan Mervič, Sep 19, 2023

Orange Fairness - Reweighing as a preprocessor

Diving deeper into the Orange fairness Reweighing widget, we explore its use as a preprocessor for models. Discover the new widgets and fairness scoring metrics, all illustrated using the German credit dataset and supplemented with visual insights through box plots.

Categories: fairness reweighing

By: Žan Mervič, Sep 19, 2023

Orange Fairness - Reweighing a Dataset

Building on our exploration of the Orange fairness add-on, this blog post delves into the Reweighing widget. By adjusting the weights of dataset instances, the widget addresses bias, focusing on underrepresented groups. Using the Compas dataset as an example, we demonstrate how bias decreases after reweighing, presenting visual insights into the distribution of adjusted weights and their impact on dataset fairness.
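
To give a feel for what adjusting instance weights can look like, here is a minimal sketch of one common reweighing scheme, in which each instance is weighted by the expected frequency of its (group, label) pair divided by the observed frequency. This is an illustration under our own assumptions, not the widget's actual code; the function name `reweighing_weights` and the toy data are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per instance: expected / observed pair frequency.

    Hypothetical helper for illustration; assumes a common reweighing
    scheme, which may differ from what the Orange widget implements.
    """
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    # Expected count of a pair under independence is
    # group_count * label_count / n; divide by the observed pair count.
    return [
        (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Toy data (made up): group "a" receives the favorable label 1 more often.
groups = ["a", "a", "a", "b", "b", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
weights = reweighing_weights(groups, labels)


def weighted_positive_rate(group):
    """Weighted share of favorable labels within one group."""
    total = sum(w for w, g in zip(weights, groups) if g == group)
    positive = sum(
        w for w, g, y in zip(weights, groups, labels) if g == group and y == 1
    )
    return positive / total


rate_a = weighted_positive_rate("a")
rate_b = weighted_positive_rate("b")
```

With these weights, over-represented (group, label) pairs count for less and under-represented pairs count for more, so the weighted share of favorable labels comes out the same in both toy groups.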

Categories: fairness reweighing

By: Žan Mervič, Sep 19, 2023

Orange Fairness - Equal Odds Postprocessing

In this blog, we delve into the Equal Odds Postprocessing widget, a tool designed to enhance fairness in machine learning models. We break down how the algorithm works by modifying predictions to meet Equalized Odds criteria. Using a real-world example with the German credit dataset, we demonstrate its efficacy in improving fairness metrics while marginally affecting accuracy.
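
Equalized Odds asks that a model's true positive rate and false positive rate be the same across groups. The sketch below only measures how far a set of predictions is from that criterion; it does not implement the postprocessing algorithm itself. The function names and toy data are hypothetical, and the code assumes binary labels and that every group contains both positive and negative instances.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    Assumes y_true contains at least one 1 and at least one 0.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    positives = sum(1 for t in y_true if t == 1)
    negatives = sum(1 for t in y_true if t == 0)
    return tp / positives, fp / negatives


def equalized_odds_gaps(y_true, y_pred, groups, group_a, group_b):
    """Return absolute TPR and FPR differences between two groups."""

    def select(group):
        # Keep only the (true, predicted) pairs that belong to this group.
        pairs = [(t, p) for t, p, g in zip(y_true, y_pred, groups) if g == group]
        return [t for t, _ in pairs], [p for _, p in pairs]

    tpr_a, fpr_a = tpr_fpr(*select(group_a))
    tpr_b, fpr_b = tpr_fpr(*select(group_b))
    return abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)


# Toy data (made up): predictions favor group "a".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

tpr_gap, fpr_gap = equalized_odds_gaps(y_true, y_pred, groups, "a", "b")
```

Both gaps are zero when predictions satisfy Equalized Odds exactly; a postprocessing method tries to shrink them while changing as few predictions as possible.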

Categories: fairness equal odds postprocessing

By: Žan Mervič, Sep 19, 2023

Orange Fairness - Adversarial Debiasing

This blog post focuses on the Adversarial Debiasing model in Orange, a tool for enhancing fairness in your machine learning algorithms. We will walk through how to use it and explain the trade-offs that come with using fairness algorithms.

Categories: fairness adversarial debiasing

By: Žan Mervič, Sep 18, 2023

Orange Fairness - Dataset Bias

In an era where AI drives decisions impacting real lives, fairness in machine learning is paramount. Take the `Adult` dataset, which shows discrepancies in salary predictions based on gender. Addressing such concerns, Orange introduces a fairness add-on. Using new widgets, users can identify and mitigate biases in their datasets or model predictions.

Categories: fairness dataset bias