REBUS 2.0

REBUS 2.0 is a tool for analyzing elementary compositions.

It implements Generalized Entropy Agglomeration (GEA), which is described in the report of the same name.

In recording and analyzing an elementary composition:

1) Elements are distinguished from one another by a number of categorically different indices (e.g. plant species are distinguished by their English names, as shown in the figure).

2) The compositions of the Elements are indicated by a number of Blocks within which the Elements occur (e.g. particular groups of plant species occur in certain shared metabolic interactions with particular genera of Mycorrhizal fungi).

3) Individual Blocks can be weighted according to their amount of presence (e.g. the interactions that have greater presence in nature can be made to weigh more heavily on the result).

4) Each Element’s occurrence in each Block can also be weighted (e.g. particular plant species may have a higher presence in certain Mycorrhizal interactions). These Element Weights are called “rational occurrence numbers” in the report.

5) A “recurrence base” parameter is set as high as the larger rational occurrence numbers in the dataset. If the recurrence base is too small, the entropies become negative; if it is too high, the resulting dendrograms are unbalanced.

6) The Python file “rebus.py” is run to produce the dendrograms (see REBUS 2.0 in the Git repository).

7) To analyze numerical datasets such as the well-known Iris dataset, a numerical categorization procedure (described in the report) is provided with the code (“iris_convert.py”). REBUS 2.0 clusters 145 of the 150 flowers in the Iris dataset correctly.
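
The ingredients in steps 1 to 5 can be pictured as a small data model. The sketch below is only an illustration: the `Block` class, the helper functions and the plant and fungus names are hypothetical, and none of it is the input format or API of “rebus.py”. It also reads step 5 as "the recurrence base should be at least the largest rational occurrence number", which is an interpretation rather than a rule stated in the report.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class Block:
    """One Block of a composition (hypothetical structure, not the rebus.py format)."""
    name: str
    weight: Fraction = Fraction(1)  # Block weight (step 3)
    # Element name -> rational occurrence number, i.e. Element Weight (step 4)
    occurrences: dict = field(default_factory=dict)


def max_occurrence(blocks):
    """Return the largest rational occurrence number found in any Block."""
    return max(w for b in blocks for w in b.occurrences.values())


def check_recurrence_base(blocks, recurrence_base):
    """Assumed reading of step 5: the base should not be below the largest occurrence."""
    return recurrence_base >= max_occurrence(blocks)


# Made-up example data: two Blocks sharing one Element.
blocks = [
    Block("fungus_genus_A", Fraction(2), {"oak": Fraction(3), "birch": Fraction(1)}),
    Block("fungus_genus_B", Fraction(1), {"birch": Fraction(1), "pine": Fraction(3, 2)}),
]

# Elements are identified by their categorical names (step 1).
elements = sorted({e for b in blocks for e in b.occurrences})

print(elements)
print(max_occurrence(blocks))
print(check_recurrence_base(blocks, 3))
```

Using `Fraction` keeps the occurrence numbers exactly rational; plain floats would work equally well for a quick check of whether a candidate recurrence base covers the dataset.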

Here is a short video tutorial:

REBUS 2.0 is released under the GNU General Public License.

REBUS 2.0 in the GIT repository

Entropy as a measure of relevance/irrelevance

See also: REBUS 1.0

On REBUS 2.0, you can refer to my report:

Fidaner, I. B. (2017) Generalized Entropy Agglomeration, arxiv.org

On REBUS 1.0, you can refer to our workshop paper:

Fidaner, I. B. & Cemgil, A. T. (2014) Clustering Words by Projection Entropy, accepted to NIPS 2014 Modern ML+NLP Workshop.

On entropy agglomeration (EA), you can refer to our conference paper:

Fidaner, I. B. & Cemgil, A. T. (2013) Summary Statistics for Partitionings and Feature Allocations. In Advances in Neural Information Processing Systems, 26.

Have questions on REBUS, EA, PE, COD, etc.? Write to