Import text from internet location

Goal

Import text content located at one or more URLs for further processing with Orange Textable.

Ingredients

Widget URLs
Icon urls_icon
Quantity 1

Procedure

Single URL

Figure 1: Importing text from an internet location using the URLs widget.

  1. Create an instance of URLs on the canvas.
  2. Open its interface by double-clicking on the created instance.
  3. Make sure the Advanced settings checkbox is not selected.
  4. In the URL field, type the URL whose content you want to import (including the http:// prefix).
  5. In the Encoding drop-down menu, select the encoding that corresponds to this URL.
  6. Click the Send button (or make sure the Send automatically checkbox is selected).
  7. A segmentation covering the URL’s content is then available on the URLs instance’s output connections; to display or export it, see Cookbook: Text output.
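For readers who want to see what this recipe amounts to outside the graphical interface, here is a minimal Python sketch using only the standard library. It is not the widget's actual implementation; the function names `decode_content` and `import_url` are hypothetical, and the sketch assumes that importing boils down to downloading the bytes at the URL and decoding them with the selected encoding.

```python
from urllib.request import urlopen


def decode_content(raw, encoding):
    """Turn downloaded bytes into text using the chosen encoding."""
    return raw.decode(encoding)


def import_url(url, encoding="utf-8"):
    """Download the content at `url` and decode it (requires network access).

    Hypothetical helper: it mirrors steps 4 and 5 above, where a URL and
    an encoding are chosen together.
    """
    with urlopen(url) as response:
        return decode_content(response.read(), encoding)
```

Choosing the wrong encoding usually either raises a `UnicodeDecodeError` or silently produces wrong characters, which is why step 5 asks for the encoding that corresponds to the URL.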

Multiple URLs

Figure 2: Importing text from several internet locations using the URLs widget.

  1. Create an instance of URLs on the canvas.
  2. Open its interface by double-clicking on the created instance.
  3. Make sure the Advanced settings checkbox is selected.
  4. If needed, empty the list of imported URLs by clicking the Clear all button.
  5. In the URL(s) field, enter the URLs you want to import (including the http:// prefix), separated by the string “ / ” (space + slash + space); make sure they all have the same encoding (you will be able to add URLs that have other encodings later).
  6. In the Encoding drop-down menu, select the encoding that corresponds to the set of selected URLs.
  7. Click the Add button to add the set of selected URLs to the list of imported URLs.
  8. Repeat steps 5 to 7 to add URLs that use other encodings.
  9. Click the Send button (or make sure the Send automatically checkbox is selected).
  10. A segmentation containing a segment covering each imported URL’s content is then available on the URLs instance’s output connections; to display or export it, see Cookbook: Text output.
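The batch logic of the steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the widget's code: the helpers `split_urls`, `add_urls`, `fetch_bytes` and `import_all` are hypothetical names, and the list of (URL, encoding) pairs merely stands in for the widget's list of imported URLs.

```python
from urllib.request import urlopen


def split_urls(field):
    """Split the content of a URL(s) field on the ' / ' separator."""
    return [part.strip() for part in field.split(" / ") if part.strip()]


def add_urls(imported, field, encoding):
    """Append each URL in the field to the list, paired with its encoding."""
    imported.extend((url, encoding) for url in split_urls(field))


def fetch_bytes(url):
    """Download the raw bytes found at a URL (requires network access)."""
    with urlopen(url) as response:
        return response.read()


def import_all(imported, fetch=fetch_bytes):
    """Return one decoded text per imported URL, in list order.

    `fetch` can be replaced by any function mapping a URL to bytes,
    which makes the sketch usable without a network connection.
    """
    return [fetch(url).decode(encoding) for url, encoding in imported]
```

Calling `add_urls` once per encoding corresponds to repeating steps 5 to 7, and each text returned by `import_all` plays the role of one segment in the output segmentation.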