2.4. XML annotation-based selection using a regex
Another common way of exploiting annotations is to use them to select the segments that will be included or excluded by an instance of Select (see Filtering segmentations using regexes) or Intersect (see Using a segmentation to filter another). Thus, in the case of the XML data example introduced here (and further developed there), we might insert an instance of Select between those of Extract XML and Count (see figure 1 below) in order to include only “content words”.

Figure 1: Inserting an instance of Select to filter a segmentation.
In this simplified example, the Select instance could thus be parameterized (as indicated in figure 2 below) so as to exclude (Mode: Exclude) those segments whose annotation value for key type (Annotation key: type) is DET or PREP (Regex: ^(DET|PREP)$). The anchors ^ and $ ensure that the regex matches the whole annotation value rather than just a part of it.

Figure 2: Excluding segments based on annotation values with Select.
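
For readers who find it easier to follow the logic in code, the filtering described above can be sketched in plain Python. This is only an illustration of the principle, not Textable's own implementation or API: the select function, its parameters, the dictionary layout of the segments and the sample words are all hypothetical, and the handling of segments that lack the annotation key (treated here as non-matching) is an assumption.

```python
import re

# Hypothetical segments standing in for the output of Extract XML.
# These are plain dictionaries, not Textable objects, and the words
# are made up for the purpose of the illustration.
segments = [
    {"content": "the", "annotations": {"type": "DET"}},
    {"content": "cat", "annotations": {"type": "NOUN"}},
    {"content": "on", "annotations": {"type": "PREP"}},
    {"content": "mat", "annotations": {"type": "NOUN"}},
]


def select(segments, regex, annotation_key, mode="include"):
    """Keep or drop segments whose annotation value matches the regex.

    Assumption: a segment without the requested annotation key is
    treated as not matching.
    """
    pattern = re.compile(regex)
    selected = []
    for segment in segments:
        value = segment["annotations"].get(annotation_key)
        matches = value is not None and pattern.search(value) is not None
        # In "include" mode keep the matches; in "exclude" mode keep the rest.
        if matches == (mode == "include"):
            selected.append(segment)
    return selected


# Same settings as in figure 2: Mode: Exclude, Annotation key: type,
# Regex: ^(DET|PREP)$
content_words = select(segments, r"^(DET|PREP)$", "type", mode="exclude")
print([segment["content"] for segment in content_words])  # ['cat', 'mat']
```

Calling the same function with mode="include" would return the complementary set, that is, only the segments annotated as DET or PREP.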