Entropy as a measure of relevance//irrelevance

Entropy Agglomeration (EA) is the most useful algorithm you can imagine. It is neither cited nor used, and the only reason is that the established scientific paradigms cannot conceive its meaning.

In fact, the idea is very simple:

In EA, entropy is a measure of relevance//irrelevance.

— Subsets of elements that either appear together or disappear together in the blocks have low entropy: Those elements are “relevant” to each other: They literally “lift up again” each other.

— Subsets of elements that are partly appearing while partly disappearing in the blocks have large entropy: Those elements are “irrelevant” to each other: They literally “don’t lift up again” each other.
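To make the two cases above concrete, here is a minimal sketch in Python. It assumes the projection entropy of a subset S is computed as in the 2013 paper listed in the references, as I read it: each block is intersected with S, empty intersections are dropped, and every remaining block with k of the |S| elements contributes (k/|S|) log(|S|/k). The function name `projection_entropy` and the toy blocks are hypothetical, chosen only for illustration; this is not the REBUS software.

```python
import math

def projection_entropy(blocks, subset):
    """Entropy of the blocks projected onto `subset` (illustrative sketch).

    Assumed definition: sum over blocks of (k/n) * log(n/k), where
    n = |subset| and k = number of subset elements present in the block.
    """
    subset = set(subset)
    n = len(subset)
    total = 0.0
    for block in blocks:
        k = len(subset & set(block))
        # k == 0: the subset is absent from the block (dropped).
        # k == n: the subset is fully present, and log(n/n) == 0.
        # Only a partial presence adds to the entropy.
        if 0 < k < n:
            total += (k / n) * math.log(n / k)
    return total

# Hypothetical toy data: four blocks over the elements a, b, c, d.
blocks = [{"a", "b", "c"}, {"a", "b"}, {"c"}, {"a", "b", "d"}]

together = projection_entropy(blocks, {"a", "b"})  # a and b always co-occur
apart = projection_entropy(blocks, {"a", "c"})     # a and c are split in three blocks
print(together, apart)
```

In this toy data, "a" and "b" appear together or disappear together in every block, so their entropy is zero. "a" and "c" are split in three of the four blocks, so their entropy is positive: in this sense they are irrelevant to each other.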

This is all visible in the results of the analysis of James Joyce’s Ulysses: https://arxiv.org/abs/1410.6830

In this setup, entropy becomes a measure of irrelevance, literally and by definition: https://en.wiktionary.org/wiki/relevant

References:

I. B. Fidaner & A. T. Cemgil (2013) “Summary Statistics for Partitionings and Feature Allocations.” In Advances in Neural Information Processing Systems (NIPS) 26. Paper: http://papers.nips.cc/paper/5093-summary-statistics-for-partitionings-and-feature-allocations (the reviews are available on the website)

I. B. Fidaner & A. T. Cemgil (2014) “Clustering Words by Projection Entropy,” accepted to NIPS 2014 Modern ML+NLP Workshop. Paper: http://arxiv.org/abs/1410.6830 Software Webpage: https://fidaner.wordpress.com/science/rebus/
