Count transition frequency between adjacent units
Goal
Count the frequency of transitions between adjacent segment types in a text.
Prerequisites
Some text has been imported into Orange Textable (see Cookbook: Text input) and it has been segmented into smaller units (see Cookbook: Segment text in smaller units).
Ingredients
Widget: Count
Quantity: 1
Procedure

Figure 1: Count transition frequency with an instance of Count
Create an instance of Count.
Drag and drop from the output (right-hand side) of the widget that has been used to segment the text, here Segment (letters), to the input of Count (left-hand side).
Double-click on the icon of Count to open its interface.
In the Units section, select the segmentation in which transitions between units will be counted (here letters).
In the Context section, choose Mode: Left-right neighborhood.
Select Left context size: 1 and Right context size: 0.
Click the Send button or tick the Send automatically checkbox.
A table showing the results is then available at the output of Count; to display or export it, see Cookbook: Table output.
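To make clear what the resulting table contains, the counting performed by this configuration (Left context size: 1, Right context size: 0) can be sketched in plain Python. This is only an illustration of the idea, not Orange Textable's code or API: the function name count_transitions and the nested-dictionary layout (rows are the previous unit, columns are the current unit) are assumptions made for this example.

```python
from collections import Counter, defaultdict


def count_transitions(units):
    """Count how often each unit directly follows each other unit.

    Hypothetical helper for illustration only (not part of Orange Textable).
    table[previous][current] is the number of times `current` comes
    immediately after `previous` in the sequence.
    """
    table = defaultdict(Counter)
    # Pair every unit with its successor: (u0, u1), (u1, u2), ...
    for previous, current in zip(units, units[1:]):
        table[previous][current] += 1
    return table


# Segment a small text into letters, as Segment (letters) would.
letters = list("banana")
table = count_transitions(letters)

# Print one row per left context (previous letter).
for previous in sorted(table):
    print(previous, dict(table[previous]))
```

In "banana", the letter a is followed by n twice, n is followed by a twice, and b is followed by a once, so those are the only non-zero cells of the table.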
Comment
It is also possible to define units as segment pairs (bigrams), triples (trigrams), and so on, by increasing the Sequence length parameter in the Units section.
If Sequence length is set to a value greater than 1, the string in the Intra-sequence delimiter field is inserted between the elements of each n-gram in the column headers, which can make them easier to read. The default is # but you can change it to the delimiter of your choice.
Furthermore, it is possible to count the occurrences of units in more complex contexts than just the previous unit, such as the n previous units (Left context size), the n following units (Right context size), or any combination of both.
The Unit position marker is a string that marks the boundary between the left and right sides of the context. The default is _ but you can change it to the marker of your choice.
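The way these parameters combine can be illustrated with a small Python sketch. It is not Orange Textable's implementation: the function count_in_context and its parameter names are hypothetical, the row label layout (left context, then marker, then right context) is an assumption, and so is the choice to skip positions near the edges of the text where the full context is not available.

```python
from collections import Counter, defaultdict


def count_in_context(units, seq_len=1, left=1, right=0,
                     delimiter="#", marker="_"):
    """Count n-grams of `seq_len` units by their left/right neighborhood.

    Hypothetical helper for illustration only (not part of Orange Textable).
    Rows are context labels such as "a_a"; columns are units such as "n#a".
    """
    table = defaultdict(Counter)
    # i is the index of the first segment of the n-gram; only positions
    # with a complete left and right context are visited (an assumption).
    for i in range(left, len(units) - seq_len - right + 1):
        unit = delimiter.join(units[i:i + seq_len])
        left_ctx = delimiter.join(units[i - left:i])
        right_ctx = delimiter.join(units[i + seq_len:i + seq_len + right])
        table[left_ctx + marker + right_ctx][unit] += 1
    return table


letters = list("banana")

# Single letters between one preceding and one following letter.
both_sides = count_in_context(letters, seq_len=1, left=1, right=1)

# Bigrams counted after one preceding letter.
bigrams = count_in_context(letters, seq_len=2, left=1, right=0)

for name, result in (("both sides", both_sides), ("bigrams", bigrams)):
    print(name)
    for context in sorted(result):
        print(" ", context, dict(result[context]))
```

With the defaults, the bigram made of n and a is labelled n#a, and a context consisting only of a preceding a is labelled a_; passing other values for delimiter and marker changes these labels without affecting the counts.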