Orange Data Mining

Orange Blog

By: BIOLAB, Mar 27, 2012

Orange is again participating in GSoC

This year Orange is again participating in Google Summer of Code as a mentoring organization. The student proposal submission period is under way, and the deadline is April 6. We have prepared a page on our Trac with more information about the Google Summer of Code program, especially on how interested students should apply with their proposals. There is also a list of some ideas we are proposing for this year, but feel free to suggest your own ideas for how you could contribute to Orange and make it even better.

Categories: gsoc

By: BIOLAB, Feb 5, 2012

Random decisions behind your back

When Orange builds a decision tree, candidate attributes are evaluated and the best candidate is chosen. But what if two or more share the first place? Most machine learning systems ignore this and always take the first, which is unfair and, besides, has strange effects: the induced model and, consequently, its accuracy depend on the order of attributes, which should not be the case. This is not an isolated problem. Another instance is when a classifier has to choose between two equally probable classes and there is no additional information (such as classification costs) to help make the prediction.
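
The tie problem described above can be illustrated with a small, self-contained sketch. This is not Orange's actual implementation; the function name `pick_best`, the attribute names, and the scores are all hypothetical. The sketch assumes that ties are broken by a seeded random choice over the sorted tied candidates, so the outcome no longer depends on the order in which attributes are listed.

```python
import random


def pick_best(scores, rng):
    """Return the candidate with the highest score, breaking ties at random.

    `scores` maps candidate names to scores (hypothetical data), and `rng`
    is a random.Random instance supplied by the caller.
    """
    best = max(scores.values())
    # Sort the tied candidates so the choice does not depend on the order
    # in which the candidates were inserted into the dictionary.
    tied = sorted(name for name, score in scores.items() if score == best)
    return rng.choice(tied)


# The same scores, listed in two different attribute orders.
scores_a = {"age": 0.42, "income": 0.42, "height": 0.10}
scores_b = {"income": 0.42, "height": 0.10, "age": 0.42}

# With the same seed, both orderings yield the same winner.
choice_a = pick_best(scores_a, random.Random(0))
choice_b = pick_best(scores_b, random.Random(0))
print(choice_a == choice_b)
```

The same function could be applied to class probabilities when a classifier has to choose between equally probable classes. Seeding the generator is one assumed way to keep results reproducible between runs.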

Categories: tree

By: BIOLAB, Feb 2, 2012

New in Orange: Partial least squares regression

Partial least squares (PLS) regression is a regression technique that supports multiple response variables. PLS regression is very popular in areas such as bioinformatics and chemometrics, where the number of observations is usually smaller than the number of measured variables and where there is multicollinearity among the predictor variables. In such situations, standard regression techniques would usually fail. PLS regression is now available in Orange (see documentation)! You can use a PLS regression model on single-target or multi-target data sets.

Categories: multitarget pls regression

By: BIOLAB, Jan 23, 2012

Orange 2.5a2 available

Orange 2.5a2 has been uploaded to PyPI. It now includes basic support for multi-label classification (developed during the Google Summer of Code 2011), some new widget icons, and documentation for the basket format. The release is also tagged in our Bitbucket repository.

Categories: gsoc pypi release

By: BIOLAB, Jan 9, 2012

Multi-label classification (and Multi-target prediction) in Orange

Last summer, student Wencan Luo participated in Google Summer of Code to implement multi-label classification in Orange. He provided a framework and implemented a few algorithms and some prototype widgets. His work has been “hidden” in our repositories for too long; finally, we have merged part of his code into Orange (widgets are not there yet …) and added more general support for multi-target prediction. You can load multi-label tab-delimited data (e.

Categories: classification gsoc mlc multilabel

By: BIOLAB, Jan 6, 2012

New Orange icons

As more and more widgets with new features are added to Orange, icons for them have to be drawn. Most of the time these are just quick sketches, or they are missing altogether. But now we are starting to redraw and unify them. A few of them have already been made.

Categories: icons

By: BIOLAB, Jan 3, 2012

Parallel Orange?

We attended a NIPS 2011 workshop on processing and learning from large-scale data. Various presenters showed different tools and frameworks that can be used when developing algorithms suitable for large-scale data, but none of them was written in Python, so they were not useful for Orange. We have been looking for some time for a framework that would help us run code in parallel, but so far with no luck.

Categories: parallelization

By: BIOLAB, Dec 20, 2011

Earth - Multivariate adaptive regression splines

There have recently been some additions to the lineup of Orange learners. One of these is Orange.regression.earth.EarthLearner. It is an Orange interface to the Earth library written by Stephen Milborrow, which implements multivariate adaptive regression splines. So let's take it out for a spin on a simple toy dataset (data.tab - created using the Paint Data widget in the Orange Canvas):

    import Orange
    from Orange.regression import earth
    import numpy
    from matplotlib import pylab as pl
    data = Orange.

Categories: regression

By: MARKO, Dec 20, 2011

Orange 2.5: code conversion

Orange 2.5 unifies Orange’s C++ core and Python modules into a single module hierarchy. To use the new module hierarchy, import Orange instead of orange and the accompanying orng* modules. While we will maintain backward compatibility in 2.* releases, we nevertheless suggest that programmers use the new interface. The provided conversion tool can help refactor your code to use the new interface. The conversion script, orange2to25.py, resides in Orange’s main directory. To refactor accuracy8.

Categories: orange25

By: BIOLAB, Dec 8, 2011

Random forest switches to Simple tree learner by default

Random forest classifiers now use Orange.classification.tree.SimpleTreeLearner by default, which considerably shortens their construction times. Using a random forest classifier is easy.

    import Orange

    # Load the iris dataset and train a random forest with 200 trees.
    iris = Orange.data.Table('iris')
    forest = Orange.ensemble.forest.RandomForestLearner(iris, trees=200)

    # Print each prediction next to the actual class (Python 2 print syntax).
    for instance in iris:
        print forest(instance), instance.get_class()

The example above loads the iris dataset and trains a random forest classifier with 200 trees. The classifier is then used to label all training examples, printing its prediction alongside the actual class value.

Categories: forestlearner simpletreelearner
