Module eyekit.text

Defines the TextBlock and InterestArea objects for handling texts.

Classes

class Box

Representation of a bounding box, which provides an underlying framework for Character, InterestArea, and TextBlock.

Subclasses

  • Character

  • InterestArea

  • TextBlock

Instance variables

var x : float

X-coordinate of the center of the bounding box

var y : float

Y-coordinate of the center of the bounding box

var x_tl : float

X-coordinate of the top-left corner of the bounding box

var y_tl : float

Y-coordinate of the top-left corner of the bounding box

var x_br : float

X-coordinate of the bottom-right corner of the bounding box

var y_br : float

Y-coordinate of the bottom-right corner of the bounding box

var width : float

Width of the bounding box

var height : float

Height of the bounding box

var box : tuple

The bounding box represented as x_tl, y_tl, width, and height

var center : tuple

XY-coordinates of the center of the bounding box
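The relationships between these attributes can be sketched in a few lines. The class below is only an illustration, not Eyekit's implementation: the name BoxSketch and the contains_point helper are hypothetical, and it assumes screen coordinates in which y increases downward.

```python
from dataclasses import dataclass


@dataclass
class BoxSketch:
    """Hypothetical stand-in for a bounding box defined by two corners."""

    x_tl: float  # top-left x
    y_tl: float  # top-left y
    x_br: float  # bottom-right x
    y_br: float  # bottom-right y

    @property
    def width(self):
        return self.x_br - self.x_tl

    @property
    def height(self):
        return self.y_br - self.y_tl

    @property
    def x(self):
        # X-coordinate of the center
        return self.x_tl + self.width / 2

    @property
    def y(self):
        # Y-coordinate of the center
        return self.y_tl + self.height / 2

    @property
    def box(self):
        return (self.x_tl, self.y_tl, self.width, self.height)

    @property
    def center(self):
        return (self.x, self.y)

    def contains_point(self, px, py):
        # Hypothetical helper: True if the point lies inside the box.
        return self.x_tl <= px <= self.x_br and self.y_tl <= py <= self.y_br


example_box = BoxSketch(x_tl=10, y_tl=20, x_br=30, y_br=60)
print(example_box.center, example_box.box)
```

All derived values follow from the two corner points, which is why a Character can be constructed from x_tl, y_tl, x_br, and y_br alone.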

class Character (char, x_tl, y_tl, x_br, y_br, baseline)

Representation of a single character of text. A Character object is essentially a one-letter string that occupies a position in space and has a bounding box. It is not usually necessary to create Character objects manually; they are created automatically during the instantiation of a TextBlock.

Ancestors

  • Box

Instance variables

var baseline : float

The y position of the character baseline

var midline : float

The y position of the character midline

Inherited members

  • Box: x, y, x_tl, y_tl, x_br, y_br, width, height, box, center

class InterestArea (chars, id, padding=0)

Representation of an interest area – a portion of a TextBlock object that is of potential interest. It is not usually necessary to create InterestArea objects manually; they are created automatically when you slice a TextBlock object or when you iterate over lines, words, characters, ngrams, or zones parsed from the raw text.

Ancestors

  • Box

Instance variables

var id : str

Interest area ID. By default, these IDs have the form 1:5:10, which represents the line number and column indices of the InterestArea in its parent TextBlock. However, an ID can be changed to any arbitrary string.
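If you need the numeric parts of a default-style ID, a small parser along the following lines may help. This is a sketch: parse_default_id is a hypothetical helper, not part of Eyekit, and it makes no assumption about whether the indices are zero-based or inclusive.

```python
def parse_default_id(ia_id):
    """Split an ID like '1:5:10' into (line, start_col, end_col).

    Returns None for IDs that do not follow the default pattern,
    since IDs can be arbitrary strings.
    """
    parts = ia_id.split(":")
    if len(parts) == 3 and all(part.isdigit() for part in parts):
        line, start_col, end_col = (int(part) for part in parts)
        return line, start_col, end_col
    return None


print(parse_default_id("1:5:10"))
print(parse_default_id("past-suffix"))
```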

var text : str

String representation of the interest area

var baseline : float

The y position of the text baseline

var midline : float

The y position of the text midline

Inherited members

  • Box: x, y, x_tl, y_tl, x_br, y_br, width, height, box, center

class TextBlock (text: list, position: tuple = None, font_face: str = None, font_size: float = None, line_height: float = None, align: str = None, anchor: str = None, alphabet: str = None)

Representation of a piece of text, which may be a word, sentence, or entire multiline passage.

Initialized with:

  • text: The line of text (string) or lines of text (list of strings). Optionally, these can be marked up with arbitrary interest areas (zones); for example, The quick brown fox jump[ed]{past-suffix} over the lazy dog.

  • position: XY-coordinates describing the position of the TextBlock on the screen. The x-coordinate should be either the left edge, right edge, or center point of the TextBlock, depending on how the anchor argument has been set (see below). The y-coordinate should always correspond to the baseline of the first (or only) line of text.

  • font_face: Name of a font available on your system. The keywords italic and/or bold can also be included to select the desired style, e.g., Minion Pro bold italic.

  • font_size: Font size in pixels. At 72dpi, this is equivalent to the font size in points. To convert a font size from some other dpi, use font_size_at_72dpi().

  • line_height: Distance between lines of text in pixels. In general, for single line spacing, the line height is equal to the font size; for double line spacing, the line height is equal to 2 × the font size, etc. By default, the line height is assumed to be the same as the font size (single line spacing). This parameter also effectively determines the height of the bounding boxes around interest areas.

  • align: Alignment of the text within the TextBlock. Must be set to left, center, or right, and defaults to left.

  • anchor: Anchor point of the TextBlock. This determines the interpretation of the position argument (see above). Must be set to left, center, or right, and defaults to the same as the align argument. For example, if position is set to the center of the screen, anchor=left places the left edge of the TextBlock at the screen center, anchor=center centers the TextBlock on that point, and anchor=right places the right edge there; align then controls how the lines are aligned within the TextBlock.

  • alphabet: A string of characters that are to be considered alphabetical, which determines what counts as a word. By default, this includes any character defined as a letter or number in Unicode, plus the underscore character. However, if you need to modify Eyekit's default behavior, you can set a specific alphabet here. For example, if you wanted to treat apostrophes and hyphens as alphabetical, you might use alphabet="A-Za-z'-". This would allow a sentence like "Where's the orang-utan?" to be treated as three words rather than five.
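The effect of the alphabet on word counting can be reproduced with Python's re module. This sketch assumes the alphabet string is used as the contents of a regex character class; how Eyekit builds its pattern internally is not shown here.

```python
import re

sentence = "Where's the orang-utan?"

# Default-like behavior: letters, digits, and underscore count as alphabetical.
default_words = re.findall(r"\w+", sentence)

# Custom alphabet that also treats apostrophes and hyphens as alphabetical.
custom_alphabet = "A-Za-z'-"
custom_words = re.findall(f"[{custom_alphabet}]+", sentence)

print(default_words)
print(custom_words)
```

With the default-like pattern, the apostrophe and hyphen split the sentence into five pieces; with the custom alphabet, they stay inside their words and three words remain.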

Ancestors

  • Box

Static methods

def defaults(position: tuple = None, font_face: str = None, font_size: float = None, line_height: float = None, align: str = None, anchor: str = None, alphabet: str = None)

Set default TextBlock parameters. If you plan to create several TextBlocks with the same parameters, it may be useful to set the default parameters at the top of your script or at the start of your session:

import eyekit
eyekit.TextBlock.defaults(font_face='Helvetica')
txt = eyekit.TextBlock('The quick brown fox')
print(txt.font_face) # 'Helvetica'

Instance variables

var text : str

String representation of the text

var position : tuple

Position of the TextBlock

var font_face : str

Name of the font

var font_size : float

Font size in pixels (equivalent to the font size in points at 72dpi)
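The relationship between points and pixels follows from the typographic definition of a point as 1/72 of an inch. The helper below is a hypothetical illustration of that arithmetic, not an Eyekit function.

```python
def points_to_pixels(points, dpi):
    """Convert a font size in points to pixels at the given dpi.

    One point is 1/72 inch, so at 72dpi one point is exactly one pixel.
    """
    return points * dpi / 72


print(points_to_pixels(12, 72))  # same number at 72dpi
print(points_to_pixels(12, 96))  # larger at 96dpi
```

A 12-point font specified in software that assumes 96dpi therefore occupies 16 pixels, which is the value to use when the font size is expressed in pixels.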

var line_height : float

Line height in pixels (equivalent to points at 72dpi)

var align : str

Alignment of the text (either left, center, or right)

var anchor : str

Anchor point of the text (either left, center, or right)
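Because the anchor determines which part of the TextBlock the x-coordinate of position refers to, the left edge of the block can be derived as follows. This is a sketch; left_edge is a hypothetical helper and block_width is assumed to be known.

```python
def left_edge(position_x, block_width, anchor="left"):
    """Return the x-coordinate of the block's left edge for a given anchor."""
    if anchor == "left":
        return position_x
    if anchor == "center":
        return position_x - block_width / 2
    if anchor == "right":
        return position_x - block_width
    raise ValueError("anchor must be 'left', 'center', or 'right'")


# A 400-pixel-wide block positioned at x=960 (the center of a 1920-pixel screen)
for anchor in ("left", "center", "right"):
    print(anchor, left_edge(960, 400, anchor))
```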

var alphabet : str

Characters that are considered alphabetical

var n_rows : int

Number of rows in the text (i.e. the number of lines)

var n_cols : int

Number of columns in the text (i.e. the number of characters in the widest line)

var baselines : list

Y-coordinate of the baseline of each line of text
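Given that the y-coordinate of position is the baseline of the first line and line_height is the distance between lines, the row count, column count, and baselines can be sketched as below. layout_summary is a hypothetical helper, and the sketch assumes screen coordinates in which y increases downward.

```python
def layout_summary(lines, first_baseline_y, line_height):
    """Return (n_rows, n_cols, baselines) for a list of text lines."""
    n_rows = len(lines)
    # Number of characters in the widest line
    n_cols = max(len(line) for line in lines)
    # Each subsequent baseline is one line height further down the screen
    baselines = [first_baseline_y + i * line_height for i in range(n_rows)]
    return n_rows, n_cols, baselines


summary = layout_summary(["The quick brown", "fox jumped"], 100, 30)
print(summary)
```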

var midlines : list

Y-coordinate of the midline of each line of text

Methods

def zones(self)

Iterate over each marked-up zone as an InterestArea.

def which_zone(self, fixation)

Return the marked-up zone that the fixation falls inside as an InterestArea.
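To see what the zone markup encodes, the sketch below strips [text]{id} markup from a line and records the column span of each zone. The regex and the parse_zones helper are assumptions made for illustration; Eyekit's own parser may differ.

```python
import re

# Matches markup of the form [zone text]{zone-id}
ZONE_PATTERN = re.compile(r"\[(.+?)\]\{(.+?)\}")


def parse_zones(marked_up):
    """Return (clean_text, zones), where zones maps id -> (start, end) columns."""
    clean_parts = []
    zones = {}
    cursor = 0     # position in the marked-up string
    clean_len = 0  # length of the clean text built so far
    for match in ZONE_PATTERN.finditer(marked_up):
        before = marked_up[cursor:match.start()]
        clean_parts.append(before)
        clean_len += len(before)
        zone_text, zone_id = match.group(1), match.group(2)
        zones[zone_id] = (clean_len, clean_len + len(zone_text))
        clean_parts.append(zone_text)
        clean_len += len(zone_text)
        cursor = match.end()
    clean_parts.append(marked_up[cursor:])
    return "".join(clean_parts), zones


clean, zones = parse_zones(
    "The quick brown fox jump[ed]{past-suffix} over the lazy dog."
)
print(clean)
print(zones)
```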

def lines(self)

Iterate over each line as an InterestArea.

def which_line(self, fixation)

Return the line that the fixation falls inside as an InterestArea.

def words(self, pattern=None, line_n=None, add_padding=True)

Iterate over each word as an InterestArea. Optionally, you can supply a regex pattern to define what constitutes a word or to pick out specific words. For example, r'\b[Tt]he\b' gives you all occurrences of the word the, and '[a-z]+ing' gives you all words ending with -ing. add_padding adds half of the width of a space character to the left and right edges of the word's bounding box, so that a fixation that falls on the space between two words will still fall into at least one of the two words' bounding boxes.
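The two example patterns can be tried on plain strings with Python's re module, and the padding rule reduces to simple arithmetic. In this sketch, pad_word_box is a hypothetical helper and the pixel values are made up for illustration.

```python
import re

sentence = "The quick brown fox jumped over the lazy dog"

# All occurrences of the word "the", capitalized or not
the_words = re.findall(r"\b[Tt]he\b", sentence)

# All lowercase words ending in -ing
ing_words = re.findall(r"[a-z]+ing", "singing and running")


def pad_word_box(x_tl, x_br, space_width):
    """Extend a word's horizontal extent by half a space on each side."""
    half_space = space_width / 2
    return x_tl - half_space, x_br + half_space


print(the_words)
print(ing_words)
print(pad_word_box(100, 150, 8))
```

With padding applied to two adjacent words, their boxes meet in the middle of the intervening space, so no fixation falls between them.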

def which_word(self, fixation, pattern=None, line_n=None, add_padding=True)

Return the word that the fixation falls inside as an InterestArea. For the meaning of pattern and add_padding see TextBlock.words().

def characters(self, line_n=None, alphabetical_only=True)

Iterate over each character as an InterestArea.

def which_character(self, fixation, line_n=None, alphabetical_only=True)

Return the character that the fixation falls inside as an InterestArea.

def ngrams(self, n, line_n=None, alphabetical_only=True)

Iterate over each ngram, for given n, as an InterestArea.
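An ngram here is a run of n consecutive characters on a line. The sliding-window sketch below shows the idea on a plain string; iter_ngrams is a hypothetical helper, and treating alphabetical_only as "skip any ngram containing a non-alphabetical character" is an assumption about the intended behavior.

```python
def iter_ngrams(line, n, alphabetical_only=True):
    """Yield each run of n consecutive characters in the line."""
    for i in range(len(line) - n + 1):
        gram = line[i:i + n]
        # Approximates the default alphabet: letters, digits, and underscore
        is_alphabetical = all(ch.isalnum() or ch == "_" for ch in gram)
        if alphabetical_only and not is_alphabetical:
            continue
        yield gram


bigrams = list(iter_ngrams("the fox", 2))
all_bigrams = list(iter_ngrams("the fox", 2, alphabetical_only=False))
print(bigrams)
print(all_bigrams)
```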

def word_centers(self)

Return the XY-coordinates of the center of each word.

Inherited members

  • Box: x, y, x_tl, y_tl, x_br, y_br, width, height, box, center