These are the notes for using REBUS 1.0.

 

Simplest use:

1) Prepare a yourtext.txt file where each line contains a paragraph.

2) Edit rebus.py and replace 'ulysses' with 'yourtext'.

3) Put rebus.py next to the yourtext.txt file, and run it.

4) Collect your PDF/EPS/PNG dendrograms from the ./yourtext directory.
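
If your source text wraps lines inside a paragraph and uses blank lines between paragraphs, it has to be flattened to one paragraph per line before step 1. The following is only a minimal sketch under that assumption; the function name paragraphs_to_lines is hypothetical and not part of rebus.py.

```python
import re

def paragraphs_to_lines(raw_text):
    """Return the text with each blank-line-separated paragraph on one line."""
    # Split wherever a newline is followed by optional whitespace and another newline.
    paragraphs = re.split(r"\n\s*\n", raw_text.strip())
    # Collapse line breaks and repeated spaces inside each paragraph.
    flattened = [" ".join(p.split()) for p in paragraphs]
    return "\n".join(p for p in flattened if p)

sample = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph.\n"
flattened = paragraphs_to_lines(sample)
print(flattened)
```

The result can then be written to yourtext.txt with an ordinary file write.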

 

Valid characters:

If your words may contain digits and underscores, replace this line:

if not (lines[i][j][j2].isalpha() or lines[i][j][j2]==u"'" or lines[i][j][j2]==u"-"):

with this line:

if not (lines[i][j][j2].isalnum() or lines[i][j][j2]==u"'" or lines[i][j][j2]==u"-" or lines[i][j][j2]==u"_"):
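
To see which characters each variant accepts, the two conditions can be pulled out into a small standalone predicate. This is a sketch for experimenting outside rebus.py; the function name is_valid_char and its flag are hypothetical, and what rebus.py does with a rejected character is not shown here.

```python
def is_valid_char(ch, allow_digits_and_underscore=False):
    """Mirror the two character tests: default and digits-plus-underscore."""
    if allow_digits_and_underscore:
        # isalnum() accepts letters and digits; the underscore is added explicitly.
        return ch.isalnum() or ch in (u"'", u"-", u"_")
    # isalpha() accepts letters only.
    return ch.isalpha() or ch in (u"'", u"-")

for ch in u"a3_'-!":
    print(repr(ch), is_valid_char(ch), is_valid_char(ch, True))
```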

 

Case sensitivity:

If your words are case-sensitive, remove this line:

lines[i]=lines[i].lower()

 

Batch run:

To process several texts in one run, list their names in text_names:

text_names = [ 'yourtext1', 'yourtext2', 'yourtext3' ]
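
Assuming a batch run follows the same naming convention as the single-text case (input yourtext.txt, output in ./yourtext), the files involved can be listed as below. The helper batch_paths is hypothetical and only illustrates that assumed convention.

```python
text_names = ['yourtext1', 'yourtext2', 'yourtext3']

def batch_paths(names):
    """Map each text name to its assumed (input file, output directory) pair."""
    return [(name + '.txt', './' + name) for name in names]

for input_file, output_dir in batch_paths(text_names):
    print(input_file, '->', output_dir)
```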

 

Paragraph separation:

If you set merge_lines = 1, double newlines are treated as the paragraph separator, instead of single newlines.
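
The difference between the two modes can be sketched as follows. This is an illustration of the described behaviour, not the code in rebus.py; the function name split_paragraphs is hypothetical, and joining wrapped lines with a space is an assumption.

```python
def split_paragraphs(text, merge_lines=0):
    """Split text into paragraphs on single or double newlines."""
    if merge_lines:
        # Paragraphs are separated by double newlines; the single newlines
        # inside a paragraph are merged into spaces (assumed behaviour).
        chunks = text.split("\n\n")
        paragraphs = [" ".join(chunk.split("\n")) for chunk in chunks]
    else:
        # Every line is a paragraph of its own.
        paragraphs = text.split("\n")
    # Drop empty paragraphs and surrounding whitespace.
    return [p.strip() for p in paragraphs if p.strip()]

text = "a b\nc d\n\ne f\n"
print(split_paragraphs(text, merge_lines=0))
print(split_paragraphs(text, merge_lines=1))
```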

 

Projection size ranges:

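These two lines determine the range of projection sizes for each generated dendrogram. The entries are paired by position: each dendrogram covers the words whose projection size lies between the corresponding entries of min_proj_sizes and max_proj_sizes.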

min_proj_sizes = [10, 11, 12, 15, 20, 30, 40, 60, 150]
max_proj_sizes = [10, 11, 13, 17, 25, 39, 59, 149, inf]

If you want a single dendrogram for all words, you can set:

min_proj_sizes = [1]
max_proj_sizes = [inf]

If you want a single dendrogram for the words that occur in at least a and at most b paragraphs, you can set:

min_proj_sizes = [a]
max_proj_sizes = [b]
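
The way paired ranges select words can be sketched as below. The sketch assumes that a word's projection size is the number of paragraphs it occurs in and that both bounds are inclusive; the function names and the definition of inf as float("inf") are assumptions, not taken from rebus.py.

```python
from collections import defaultdict

inf = float("inf")  # assumed definition of inf

def projection_sizes(paragraphs):
    """Count, for each word, the number of paragraphs it occurs in."""
    sizes = defaultdict(int)
    for paragraph in paragraphs:
        # set() makes a word count once per paragraph, however often it repeats.
        for word in set(paragraph.split()):
            sizes[word] += 1
    return dict(sizes)

def group_by_range(sizes, min_proj_sizes, max_proj_sizes):
    """Return one sorted word list per (min, max) pair, bounds inclusive."""
    groups = []
    for low, high in zip(min_proj_sizes, max_proj_sizes):
        groups.append(sorted(w for w, n in sizes.items() if low <= n <= high))
    return groups

sizes = projection_sizes(["a b", "a c", "a b d"])
print(group_by_range(sizes, [1, 2], [1, inf]))
print(group_by_range(sizes, [1], [inf]))
```

Each inner list corresponds to the words that would go into one dendrogram under these assumptions.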

 
