.. meta::
   :description: Orange Textable documentation, segmenting with regexes
   :keywords: Orange, Textable, documentation, segment, regex

Segmenting with regexes
=======================

This tutorial has already shown how to use the :doc:`Segment ` widget to
segment text into words, letters, or lines using the options of its drop-down
menu.

.. _segmenting_with_regexes_fig1:

.. figure:: figures/segmenting_with_regexes_fig1.png
    :align: center
    :alt: Interface of widget Segment configured with regex "\\w+"
    :scale: 80 %

    Figure 1: Interface of the :doc:`Segment ` widget, configured for word segmentation

As a matter of fact, these options in the interface of :doc:`Segment ` rely
internally on regular expressions. For instance, the **Segment into words**
option uses the regex ``\w+``. It divides each incoming segment into sequences
of alphanumeric characters (and underscores), which in our case amounts to
segmenting *a simple example* into three words. Similarly, the regex ``\w`` is
used to obtain a segmentation into letters (or, to be precise, alphanumeric
characters or underscores).

With some knowledge of regular expressions, you can use the
**Use a regular expression** option in the drop-down menu to perform more
specific queries. If the relevant unit is the word, regexes will often use the
``\b`` *anchor*, which represents a word boundary. For instance, words that
contain fewer than 4 characters can be retrieved with ``\b\w{1,3}\b``, those
ending in *-tion* with ``\b\w+tion\b``, and the inflected forms of *retrieve*
with ``\bretriev(e|es|ed|ing)\b``.

.. _segmenting_with_regexes_fig2:

.. figure:: figures/segmenting_with_regexes_fig2.png
    :align: center
    :alt: Interface of widget Segment configured with regex "\\b\\w{1,3}\\b"
    :scale: 80 %

    Figure 2: Using a Regular Expression (``\b\w{1,3}\b``) with the :doc:`Segment ` widget

In these examples, the same result can be achieved by first using the built-in
**Segment into words** option and filtering the result with the :doc:`Select