Class and Function Documentation

Table of Contents

Name Chatterlang Name Description
talkpipe.chatterlang.compiler
Accum accum Accumulates items from the input stream both in an internal buffer and in the specified variable.
talkpipe.data.email
sendEmail sendEmail Send emails for each item in the input iterable using SMTP.
talkpipe.data.extraction
FileExtractor extract A class for extracting text content from different file types.
readdocx readdocx Read and extract text from Microsoft Word (.docx) files.
readtxt readtxt Reads text files from given file paths and yields their contents.
talkpipe.data.html
downloadURLSegment downloadURL Download a URL and return its content.
htmlToTextSegment htmlToText Converts HTML content to plain text.
talkpipe.data.mongo
MongoInsert mongoInsert Insert items from the input stream into a MongoDB collection.
MongoSearch mongoSearch Search a MongoDB collection and yield results.
talkpipe.data.rss
rss_source rss Generator function that monitors and yields new entries from an RSS feed.
talkpipe.llm.chat
LlmExtractTerms llmExtractTerms For each piece of text read from the input stream, extract terms from the text.
LLMPrompt llmPrompt Interactive, optionally multi-turn, chat with an LLM.
LlmScore llmScore For each piece of text read from the input stream, compute a score and an explanation for that score.
talkpipe.llm.embedding
LLMEmbed llmEmbed Read strings from the input stream and emit an embedding for each string using a language model.
talkpipe.operations.filtering
distinctBloomFilter distinctBloomFilter Filter items using a Bloom Filter to yield only distinct elements based on specified fields.
talkpipe.operations.matrices
ReduceTSNE reduceTSNE Use t-SNE to reduce dimensionality of provided matrix.
ReduceUMAP reduceUMAP Use UMAP to reduce dimensionality of provided matrix.
talkpipe.operations.threading
threadedSegment threaded Links the input stream to a threaded queue system.
talkpipe.operations.transforms
fill_null fillNull Fills null (None) values in a sequence of dictionaries with specified defaults.
MakeLists makeLists
regex_replace regexReplace Transform items by applying regex pattern replacement.
talkpipe.pipe.basic
appendAs appendAs Appends the specified fields to the input item.
call_func call_func Call a function on each item in the input stream.
Cast cast Casts the input data to a specified type.
concat concat Concatenates specified fields from each item with a delimiter.
ConfigureLogger configureLogger Configures loggers based on the provided logger levels and files.
DescribeData describe Returns a dictionary of all attributes of the input data.
everyN everyN Yields every nth item from the input stream.
exec exec Executes a command and yields each line written to stdout as an item.
fillTemplate fillTemplate Fill a template string with values from the input item.
First firstN Passes on the first N items from the input stream.
flatten flatten Flattens a nested list of items.
Hash hash Hashes the input data using the specified algorithm.
isIn isIn Filters items based on whether a field contains a specified value.
isNotIn isNotIn Filters items based on whether a field does not contain a specified value.
slice slice Slices a sequence using start and end indices.
ToDataFrame toDataFrame Drain all items from the input stream and emit a single DataFrame.
ToDict toDict Creates a dictionary from the input data.
ToList toList Drains the input stream and emits a list of all items.
talkpipe.pipe.io
dumpsJsonl dumpsJsonl Drains the input stream and dumps each item as a jsonl string.
echo echo A source that generates input from a string.
loadsJsonl loadsJsonl Reads each item from the input stream, interpreting it as a jsonl string.
Log log An operation that logs each item from the input stream.
Print print An operation that prints and passes on each item from the input stream.
Prompt prompt A source that generates input from a prompt.
readJsonl readJsonl Reads each item from the input stream as a path to a jsonl file and loads each line as a JSON object.
writePickle writePickle Drains the input stream into a list and then writes the list as a pickle file.
talkpipe.pipe.math
arange range Generate a range of integers between lower and upper
gt gt Filter items where a specified field's value is greater than a number.
lt lt Filters items based on a field value being less than a specified number.
randomInts randomInts Generate n random integers between lower and upper.
scale scale Scale each item in the input stream by the multiplier.
talkpipe.chatterlang.compiler

Segment Class: Accum

Chatterlang Name: accum

Accumulates items from the input stream both in an internal buffer and in the specified variable.  
This is useful for accumulating the results of running the pipeline multiple times.     

Args:
    variable (Union[VariableName, str], optional): The name of the variable to store the accumulated data in. Defaults to None.
    reset (bool, optional): Whether to reset the accumulator each time the segment is run. Defaults to True.

Parameters:

Base Classes: io.AbstractSegment

talkpipe.data.email

Segment Function: sendEmail

Chatterlang Name: sendEmail

Send emails for each item in the input iterable using SMTP.

This function processes a list of items and sends an email for each one, using the specified
fields for subject and body content. It supports both HTML and plain text email formats.

Args:
    subject_field (str): Field name in the item to use as email subject
    body_fields (list[str]): List of field names to include in email body
    sender_email (str, optional): Sender's email address. If None, uses config value
    recipient_email (str, optional): Recipient's email address. If None, uses config value
    smtp_server (str, optional): SMTP server address. Defaults to 'smtp.gmail.com'
    port (int, optional): SMTP server port. Defaults to 587

Yields:
    item: Each processed item, yielded after its corresponding email is sent

Raises:
    AssertionError: If subject_field or body_fields are None
    ValueError: If required fields are missing in items

Example:
    >>> items = [{'title': 'Hello', 'content': 'World'}]
    >>> for item in sendEmail(items, 'title', ['content'], 'sender@email.com', 'recipient@email.com'):
    ...     print(f"Processed {item}")

Notes:
    - Requires valid SMTP credentials in config
    - Supports HTML formatting in email body
    - Uses TLS encryption for email transmission

Parameters:

talkpipe.data.extraction

Segment Class: FileExtractor

Chatterlang Name: extract

A class for extracting text content from different file types.

This class implements the AbstractSegment interface and provides functionality to extract
text content from various file formats using registered extractors. It supports multiple
file formats and can be extended with additional extractors.

Attributes:
    _extractors (dict): A dictionary mapping file extensions to their corresponding extractor functions.

Methods:
    register_extractor(file_extension: str, extractor): Register a new file extractor for a specific extension.
    extract(file_path: Union[str, PosixPath]): Extract content from a single file.
    transform(input_iter): Transform an iterator of file paths into an iterator of their contents.

Example:
    >>> extractor = FileExtractor()
    >>> content = extractor.extract("document.txt")
    >>> for text in extractor.transform(["file1.txt", "file2.docx"]):
    ...     print(text)

Raises:
    Exception: When trying to extract content from a file with an unsupported extension.

Parameters:

Base Classes: AbstractSegment

Segment Function: readdocx

Chatterlang Name: readdocx

Read and extract text from Microsoft Word (.docx) files.

This function takes an iterable of file paths to .docx documents and yields the
extracted text content from each document, with paragraphs joined by spaces.

Yields:
    str: The full text content of each document with paragraphs joined by spaces

Raises:
    Exception: If there are issues reading the .docx files

Example:
    >>> paths = ['doc1.docx', 'doc2.docx']
    >>> for text in readdocx(paths):
    ...     print(text)

Segment Function: readtxt

Chatterlang Name: readtxt

Reads text files from given file paths and yields their contents.

Args:
    file_paths (Iterable[str]): An iterable containing paths to text files to be read.

Yields:
    str: The contents of each text file.

Raises:
    FileNotFoundError: If a file path does not exist.
    IOError: If there is an error reading any of the files.

Example:
    >>> files = ['file1.txt', 'file2.txt']
    >>> for content in readtxt(files):
    ...     print(content)

talkpipe.data.html

Segment Function: downloadURLSegment

Chatterlang Name: downloadURL

Download a URL and return its content.

This segment is a wrapper around the downloadURL function.
It attempts to download content from the specified URL with configurable error handling
and timeout settings.

Args:
    fail_on_error (bool, optional): If True, raises exceptions on download errors.
        If False, returns None on errors. Defaults to True.
    timeout (int, optional): The timeout in seconds for the download request. 
        Defaults to 10 seconds.

Returns:
    bytes|None: The downloaded content as bytes if successful, None if fail_on_error
        is False and an error occurs.

Raises:
    Various exceptions from downloadURL function when fail_on_error is True and
    an error occurs during download.

Parameters:

Segment Function: htmlToTextSegment

Chatterlang Name: htmlToText

Converts HTML content to plain text.

This function takes HTML content and converts it to plain text format.
If cleanText is enabled, the resulting text is also cleaned in an attempt
to retain only the main body content.

Args:
    raw (str): The raw HTML content to be converted
    cleanText (bool, optional): Whether to clean and normalize the output text. Defaults to True.

Returns:
    str: The extracted text content from the HTML

See Also:
    htmlToText: The underlying function used for HTML to text conversion

Parameters:

talkpipe.data.mongo

Segment Class: MongoInsert

Chatterlang Name: mongoInsert

Insert items from the input stream into a MongoDB collection.

For each item received, this segment inserts it into the specified MongoDB collection
and then yields the item back to the pipeline. This allows for both persisting data
and continuing to process it in subsequent pipeline stages.

Args:
    connection_string (str, optional): MongoDB connection string. If not provided,
        will attempt to get from config using the key "mongo_connection_string".
    database (str): Name of the MongoDB database to use.
    collection (str): Name of the MongoDB collection to use.
    field (str, optional): Field to extract from each item for insertion. 
        If not provided, inserts the entire item. Default is "_".
    fields (str, optional): Comma-separated list of fields to extract and include in the 
        document, in the format "field1:name1,field2:name2". If provided, this creates a 
        new document with the specified fields. Cannot be used with 'field' parameter.
    append_as (str, optional): If provided, adds the MongoDB insertion result
        to the item using this field name. Default is None.
    create_index (str, optional): If provided, creates an index on this field.
        Default is None.
    unique_index (bool, optional): If True and create_index is provided, 
        creates a unique index. Default is False.

Parameters:

Base Classes: core.AbstractSegment

Segment Class: MongoSearch

Chatterlang Name: mongoSearch

Search a MongoDB collection and yield results.

This segment performs a query against a MongoDB collection and yields
the matching documents one by one as they are returned from the database.

Args:
    field (str): The field in the incoming item to use as the query. Defaults to "_".
    connection_string (str, optional): MongoDB connection string. If not provided,
        will attempt to get from config using the key "mongo_connection_string".
    database (str): Name of the MongoDB database to use.
    collection (str): Name of the MongoDB collection to use.
    project (str, optional): JSON string defining the projection for returned documents.
        Default is None (returns all fields).
    sort (str, optional): JSON string defining the sort order. Default is None.
    limit (int, optional): Maximum number of results to return per query. Default is 0 (no limit).
    skip (int, optional): Number of documents to skip. Default is 0.
    append_as (str, optional): If provided, adds the MongoDB results to the incoming item
        using this field name. If not provided, the results themselves are yielded.
    as_list (bool, optional): If True and append_as is provided, all results are collected
        into a list and appended to the incoming item. Default is False.
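
Example (construction sketch only; the database, collection, and query values below are
placeholders, not values taken from this documentation):
    >>> search = MongoSearch(database="news", collection="articles",
    ...                      project='{"title": 1, "_id": 0}',
    ...                      sort='{"published": -1}', limit=10)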

Parameters:

Base Classes: core.AbstractSegment

talkpipe.data.rss

Source Function: rss_source

Chatterlang Name: rss

Generator function that monitors and yields new entries from an RSS feed.

This function continuously monitors an RSS feed at the specified URL and yields new entries
as they become available. It uses a SQLite database to keep track of previously seen entries
to avoid duplicates.

Args:
    url (str): The URL of the RSS feed to monitor.  If None, the URL is read from the config using
        the key "RSS_URL"
    db_path (str, optional): Path to the SQLite database file for storing entry history.
        Defaults to ':memory:' for an in-memory database.
    poll_interval_minutes (int, optional): Number of minutes to wait between polling
        the RSS feed for updates. Defaults to 10 minutes.

Yields:
    dict: New entries from the RSS feed, containing feed item data.

Example:
    >>> for entry in rss_source("http://example.com/feed.xml"):
    ...     print(entry["title"])

Parameters:

talkpipe.llm.chat

Segment Class: LlmExtractTerms

Chatterlang Name: llmExtractTerms

For each piece of text read from the input stream, extract terms from the text.

The system prompt must be provided and should explain the nature of the terms. For
example, a system_prompt might be:

    Extract keywords from the following text.

See the LLMPrompt segment for more information on the other arguments.

Base Classes: AbstractLLMGuidedGeneration

Segment Class: LLMPrompt

Chatterlang Name: llmPrompt

Interactive, optionally multi-turn, chat with an LLM.

Reads prompts from the input stream and emits responses from the LLM.
The model name and source can be specified in three different ways.  If
explicitly included in the constructor, those values will be used.  If not,
the values will be loaded from environment variables (TALKPIPE_default_model_name
and TALKPIPE_default_source).  If those are not set, the values will be loaded
from the configuration file (~/.talkpipe.toml).  If none of those are set, an 
error will be raised.

Args:
    name (str, optional): The name of the model to chat with. Defaults to None.
    source (ModelSource, optional): The source of the model. Defaults to None. Valid values are "openai" and "ollama."
    system_prompt (str, optional): The system prompt for the model. Defaults to "You are a helpful assistant.".
    multi_turn (bool, optional): Whether the chat is multi-turn. Defaults to True.
    pass_prompts (bool, optional): Whether to pass the prompts through to the output. Defaults to False.
    field (str, optional): The field in the input item containing the prompt. Defaults to None.
    append_as (str, optional): The field to append the response to. Defaults to None.
    temperature (float, optional): The temperature to use for the model. Defaults to 0.5.
    output_format (BaseModel, optional): A class used for guided generation. Defaults to None.
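
Example (illustrative sketch; the model name is a placeholder and the transform() call assumes
the same AbstractSegment interface documented for FileExtractor):
    >>> chat = LLMPrompt(name="llama3.2", source="ollama", multi_turn=False)
    >>> for reply in chat.transform(["What is a data pipeline?"]):
    ...     print(reply)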

Parameters:

Base Classes: AbstractSegment

Segment Class: LlmScore

Chatterlang Name: llmScore

For each piece of text read from the input stream, compute a score and an explanation for that score.

The system prompt must be provided and should explain the range of the score (which must be
a range of integers) and the meaning of the score. For example, a system_prompt might be:

    Score the following text according to how relevant it is to canines, where 0 means
    unrelated and 10 means highly related.

See the LLMPrompt segment for more information on the other arguments.

Base Classes: AbstractLLMGuidedGeneration

talkpipe.llm.embedding

Segment Class: LLMEmbed

Chatterlang Name: llmEmbed

Read strings from the input stream and emit an embedding for each string using a language model.

This segment creates vector embeddings from text using the specified embedding model.
It can extract text from a specific field in structured data or process the input directly.

Attributes:
    embedder: The embedding adapter instance that performs the actual embedding.
    field: Optional field name to extract text from structured input.
    append_as: Optional field name to append embeddings to the original item.

Parameters:

Base Classes: AbstractSegment

talkpipe.operations.filtering

Segment Function: distinctBloomFilter

Chatterlang Name: distinctBloomFilter

Filter items using a Bloom Filter to yield only distinct elements based on specified fields.

A Bloom Filter is a space-efficient probabilistic data structure used to test whether 
an element is a member of a set. False positive matches are possible, but false 
negatives are not.

Args:
    items (iterable): Input items to filter.
    capacity (int): Expected number of items to be added to the Bloom Filter.
    error_rate (float): Acceptable false positive probability (between 0 and 1).
    field_list (str, optional): Dot-separated string of nested fields to use for 
        distinctness check. Defaults to "_" which uses the entire item.

Yields:
    item: Items that have not been seen before according to the Bloom Filter.

Example:
    >>> items = [{"id": 1, "name": "John"}, {"id": 2, "name": "John"}]
    >>> list(distinctBloomFilter(items, 1000, 0.01, "name"))
    [{'id': 1, 'name': 'John'}]  # Only first item with name "John" is yielded

Note:
    Due to the probabilistic nature of Bloom Filters, there is a small chance
    of false positives (items incorrectly identified as duplicates) based on
    the specified error_rate.

Parameters:

talkpipe.operations.matrices

Segment Class: ReduceTSNE

Chatterlang Name: reduceTSNE

Use t-SNE to reduce dimensionality of provided matrix.

This segment reduces the dimensionality of the provided matrix using t-SNE 
(t-Distributed Stochastic Neighbor Embedding).

Parameters:
    n_components: The dimension of the space to embed into. Default is 2.
    perplexity: The perplexity is related to the number of nearest neighbors used
        in other manifold learning algorithms. Larger datasets usually require a
        larger perplexity. Default is 30.
    early_exaggeration: Controls how tight natural clusters in the original 
        space are in the embedded space. Default is 12.0.
    learning_rate: The learning rate for t-SNE. Default is 200.0.
    n_iter: Maximum number of iterations for the optimization. Default is 1000.
    metric: Distance metric for t-SNE. Default is 'euclidean'.
    random_state: Random state for reproducibility.
    **tsne_kwargs: Additional keyword arguments to pass to TSNE.
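
Example (standalone sketch using scikit-learn's TSNE, to which the parameters above apparently
correspond; this is not the segment's own code):
    >>> import numpy as np
    >>> from sklearn.manifold import TSNE
    >>> X = np.random.rand(100, 50)
    >>> TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(X).shape
    (100, 2)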

Parameters:

Base Classes: AbstractSegment

Segment Class: ReduceUMAP

Chatterlang Name: reduceUMAP

Use UMAP to reduce dimensionality of provided matrix.

This segment reduces the dimensionality of the provided matrix using UMAP.

Parameters:
    n_components: The dimension of the space to embed into. Default is 2.
    n_neighbors: Size of local neighborhood. Default is 15.
    min_dist: Minimum distance between embedded points. Default is 0.1.
    metric: Distance metric for UMAP. Default is 'euclidean'.
    random_state: Random state for reproducibility.
    **umap_kwargs: Additional keyword arguments to pass to UMAP.
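
Example (standalone sketch using the umap-learn package, which this segment presumably wraps;
this is not the segment's own code):
    >>> import numpy as np
    >>> import umap
    >>> X = np.random.rand(100, 50)
    >>> umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1, random_state=42).fit_transform(X).shape
    (100, 2)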

Parameters:

Base Classes: AbstractSegment

talkpipe.operations.threading

Segment Function: threadedSegment

Chatterlang Name: threaded

Links the input stream to a threaded queue system.

This segment takes an input stream and links it to a threaded queue system.
It starts the queue system and then yields from the queue, so upstream segments
do not have to wait for downstream segments to draw from them.
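
A conceptually similar pattern using only the standard library (an illustration of the buffering
behavior, not this segment's own code):

    >>> import threading, queue
    >>> def buffered(items):
    ...     q = queue.Queue()
    ...     done = object()                       # sentinel marking the end of the stream
    ...     def producer():
    ...         for item in items:                # upstream runs in its own thread...
    ...             q.put(item)
    ...         q.put(done)
    ...     threading.Thread(target=producer, daemon=True).start()
    ...     while (item := q.get()) is not done:  # ...while the consumer drains the queue
    ...         yield item
    >>> list(buffered(range(3)))
    [0, 1, 2]
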
talkpipe.operations.transforms

Segment Function: fill_null

Chatterlang Name: fillNull

Fills null (None) values in a sequence of dictionaries with specified defaults.

This generator function processes dictionaries by replacing None values with either
a general default value or specific values for named fields.

Args:
    items: An iterable of dictionaries to process.
    default (str, optional): The default value to use for any None values not 
        specified in kwargs. Defaults to ''.
    **kwargs: Field-specific default values. Each keyword argument specifies a
        field name and the default value to use for that field.

Yields:
    dict: The processed dictionary with None values replaced by defaults.

Raises:
    AssertionError: If any item in the input is not a dictionary.
    TypeError: If any item doesn't support item assignment using square brackets.

Examples:
    >>> data = [{'a': None, 'b': 1}, {'a': 2, 'b': None}]
    >>> list(fill_null(data, default='N/A'))
    [{'a': 'N/A', 'b': 1}, {'a': 2, 'b': 'N/A'}]
    
    >>> list(fill_null(data, b='EMPTY'))
    [{'a': None, 'b': 1}, {'a': 2, 'b': 'EMPTY'}]

Parameters:

Segment Class: MakeLists

Chatterlang Name: makeLists

Parameters:

Base Classes: AbstractSegment

Segment Function: regex_replace

Chatterlang Name: regexReplace

Transform items by applying regex pattern replacement.

This segment transforms items by applying a regex pattern replacement to either
the entire item (if field="_") or a specific field of the item.

Args:
    items (Iterable): Input items to transform.
    pattern (str): Regular expression pattern to match.
    replacement (str): Replacement string for matched patterns.
    field (str, optional): Field to apply transformation to. Use "_" for entire item. Defaults to "_".

Yields:
    Union[str, dict]: Transformed items. Returns string if field="_", otherwise returns modified item dict.

Raises:
    TypeError: If extracted value is not a string or if item is not subscriptable when field != "_".

Examples:
    >>> list(regex_replace(["hello world"], r"world", "everyone"))
    ['hello everyone']
    
    >>> list(regex_replace([{"text": "hello world"}], r"world", "everyone", field="text"))
    [{'text': 'hello everyone'}]

Parameters:

talkpipe.pipe.basic

Segment Function: appendAs

Chatterlang Name: appendAs

Appends the specified fields to the input item.

Equivalent to toDict, except that the item is modified in place with the new key/value pairs
rather than a new dictionary being returned.

Assumes that the input item supports item assignment using bracket notation ([]).

Parameters:

Segment Function: call_func

Chatterlang Name: call_func

Call a function on each item in the input stream.

Args:
    func (callable): The function to call on each item

Parameters:

Segment Class: Cast

Chatterlang Name: cast

Casts the input data to a specified type.

The type can be specified by passing a type object or a string representation of the type.
The cast will optionally fail silently if the data cannot be cast to the specified type.
This allows the segment to also be used as a filter that removes data which cannot be cast.
The cast occurs by calling the type object on the data.  
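
A standalone sketch of the described behavior (the fail_silently name is illustrative, not a
documented parameter of this class):
    >>> def cast_items(items, cast_type, fail_silently=True):
    ...     for item in items:
    ...         try:
    ...             yield cast_type(item)         # cast by calling the type object on the data
    ...         except (TypeError, ValueError):
    ...             if not fail_silently:
    ...                 raise                     # otherwise drop the item, acting as a filter
    >>> list(cast_items(["1", "x", "3"], int))
    [1, 3]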

Parameters:

Base Classes: AbstractSegment

Segment Function: concat

Chatterlang Name: concat

Concatenates specified fields from each item with a delimiter.

Args:
    items: Iterable of input items to process
    fields: String specifying fields to extract and concatenate
    delimiter (str, optional): String to insert between concatenated fields. Defaults to "\n" (a newline).
    append_as (str, optional): If specified, adds the concatenated result as a new field with this name.
        Defaults to None.

Yields:
    If append_as is specified, yields the original item with the concatenated result added as a new field.
    Otherwise, yields just the concatenated string.

Parameters:

Segment Class: ConfigureLogger

Chatterlang Name: configureLogger

Configures loggers based on the provided logger levels and files.

This segment configures loggers based on the provided logger levels and files.
The logger levels are specified as a string in the format "logger:level,logger:level,...".
The logger files are specified as a string in the format "logger:file,logger:file,...".

Configuration happens once, when the script is compiled or the object is instantiated, and
never again after that.  The input data is passed through unchanged.

Args:
    logger_levels (str): Logger levels in format 'logger:level,logger:level,...'
    logger_files (str): Logger files in format 'logger:file,logger:file,...'
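
A standalone sketch of what a 'logger:level' specification expands to with the standard logging
module (the logger names are placeholders; this is not the segment's own code):
    >>> import logging
    >>> spec = "talkpipe.pipe:DEBUG,talkpipe.llm:INFO"
    >>> for name, level in (pair.split(":") for pair in spec.split(",")):
    ...     logging.getLogger(name).setLevel(getattr(logging, level))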

Parameters:

Base Classes: AbstractSegment

Segment Class: DescribeData

Chatterlang Name: describe

Returns a dictionary of all attributes of the input data.

This is useful mostly for debugging and understanding the 
structure of the data.

Parameters:

Base Classes: AbstractSegment

Segment Function: everyN

Chatterlang Name: everyN

Yields every nth item from the input stream.

Args:
    items: Iterable of items to process
    n: Number of items to skip between each yield

Yields:
    Every nth item from the input stream.

Parameters:

Source Function: exec

Chatterlang Name: exec

Executes a command and yields each line written to stdout as an item.

Parameters:

Segment Function: fillTemplate

Chatterlang Name: fillTemplate

Fill a template string with values from the input item.

Args:
    item: The input item containing values to fill the template
    template (str): The template string with placeholders for values

Returns:
    str: The filled template string
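
Example (conceptual sketch using str.format as a stand-in; the placeholder syntax actually used
by fillTemplate is not documented here):
    >>> item = {"name": "Ada", "lang": "Python"}
    >>> "Hello {name}, welcome to {lang}.".format(**item)
    'Hello Ada, welcome to Python.'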

Parameters:

Segment Class: First

Chatterlang Name: firstN

Passes on the first N items from the input stream.

Args:
    n (int): The number of items to pass on. Default is 1.

Parameters:

Segment Function: flatten

Chatterlang Name: flatten

Flattens a nested list of items.

Args:
    items: Iterable of items to flatten

Yields:
    Flattened list of items

Segment Class: Hash

Chatterlang Name: hash

Hashes the input data using the specified algorithm.

This segment hashes the input data using the specified algorithm.
Strings will be encoded and hashed.  All other datatypes will be hashed using either pickle or repr().

Args:
    algorithm (str): Hash algorithm to use.  Options include SHA1, SHA224, SHA256, SHA384, SHA512, SHA-3, and MD5.
    use_repr (bool): If True, the repr() version of the input data is hashed.  If False, the input data is hashed via 
        pickling.  Using repr() will handle all objects, even those that can't be pickled, and is not subject to
        changes in pickling formats.  However, the pickled version will include more state and is generally more reliable.
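
A standalone sketch of the described hashing strategy using hashlib (not this class's own code):
    >>> import hashlib, pickle
    >>> def hash_item(item, algorithm="sha256", use_repr=False):
    ...     payload = repr(item).encode() if use_repr else pickle.dumps(item)
    ...     return hashlib.new(algorithm, payload).hexdigest()
    >>> len(hash_item({"id": 1}, use_repr=True))   # a SHA-256 hex digest is 64 characters
    64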

Parameters:

Base Classes: AbstractSegment

Segment Function: isIn

Chatterlang Name: isIn

Filters items based on whether a field contains a specified value.

Args:
    items: Iterable of items to filter
    field: Field name to check for value
    value: Value to check for in the field

Yields:
    Items where the specified field contains the specified value.

Parameters:

Segment Function: isNotIn

Chatterlang Name: isNotIn

Filters items based on whether a field does not contain a specified value.

Args:
    field: Field name to check for value
    value: Value to check for in the field

Yields:
    Items where the specified field does not contain the specified value.

Parameters:

Segment Function: slice

Chatterlang Name: slice

Slices a sequence using start and end indices.

This function takes a sequence and a range string in the format "start:end" to slice the sequence.
Both start and end indices are optional.

Args:
    item: Any sequence that supports slicing (e.g., list, string, tuple)
    range (str, optional): String in format "start:end" where both start and end are optional.
        For example: "2:5", ":3", "4:", ":" are all valid. Defaults to None.

Returns:
    The sliced sequence containing elements from start to end index.
    If range is None, returns a full copy of the sequence.

Examples:
    >>> slice([1,2,3,4,5], "1:3")
    [2, 3]
    >>> slice("hello", ":3")
    "hel"
    >>> slice([1,2,3,4,5], "2:")
    [3, 4, 5]

Parameters:

Segment Class: ToDataFrame

Chatterlang Name: toDataFrame

Drain all items from the input stream and emit a single DataFrame.

The input data stream should be composed of dictionaries, where each 
dictionary represents a row in the DataFrame.
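
A standalone pandas sketch of the kind of DataFrame that is emitted (not the segment itself):
    >>> import pandas as pd
    >>> rows = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]
    >>> pd.DataFrame(rows)
      name  value
    0    a      1
    1    b      2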

Parameters:

Base Classes: AbstractSegment

Segment Class: ToDict

Chatterlang Name: toDict

Creates a dictionary from the input data.

Parameters:

Base Classes: AbstractSegment

Segment Class: ToList

Chatterlang Name: toList

Drains the input stream and emits a list of all items.

Parameters:

Base Classes: AbstractSegment

talkpipe.pipe.io

Segment Function: dumpsJsonl

Chatterlang Name: dumpsJsonl

Drains the input stream and dumps each item as a jsonl string.
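
A conceptually equivalent standalone sketch using the json module (not this segment's own code):
    >>> import json
    >>> [json.dumps(item) for item in [{"a": 1}, {"a": 2}]]
    ['{"a": 1}', '{"a": 2}']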
    

Source Function: echo

Chatterlang Name: echo

A source that generates input from a string.

This source will generate input from a string, splitting it on a delimiter.

Parameters:

Segment Function: loadsJsonl

Chatterlang Name: loadsJsonl

Reads each item from the input stream, interpreting it as a jsonl string.

Segment Class: Log

Chatterlang Name: log

An operation that logs each item from the input stream.

Parameters:

Base Classes: AbstractSegment

Segment Class: Print

Chatterlang Name: print

An operation that prints and passes on each item from the input stream.

Parameters:

Base Classes: AbstractSegment

Source Class: Prompt

Chatterlang Name: prompt

A source that generates input from a prompt.

This source will generate input from a prompt until the user enters an EOF.
It is for creating interactive pipelines.  It uses prompt_toolkit under the
hood to provide a nice prompt experience.

Parameters:

Base Classes: AbstractSource

Segment Function: readJsonl

Chatterlang Name: readJsonl

Reads each item from the input stream as a path to a jsonl file. Loads each line of
each file as a JSON object and yields each one individually.

Segment Function: writePickle

Chatterlang Name: writePickle

Drains the input stream into a list and then writes the list as a pickle file.

Args:
    fname (str): The name of the file to write.
    first_only (bool): If True, the segment will write only the first item in the input stream,
        throwing an exception if there is more than one.
        If False, the segment will write the entire input stream.

Parameters:

talkpipe.pipe.math

Source Function: arange

Chatterlang Name: range

Generate a range of integers between lower and upper.

Parameters:

Segment Function: gt

Chatterlang Name: gt

Filter items where a specified field's value is greater than a number.

Takes an iterable of items and yields only those where the value of the specified field
is greater than the given number n.

Args:
    items: Iterable of items to filter
    field: String representing the field/property to compare
    n: Number to compare against

Yields:
    Items where the specified field's value is greater than n

Raises:
    AttributeError: If the specified field is missing from any item

Example:
    >>> list(gt([{'value': 5}, {'value': 2}, {'value': 8}], 'value', 4))
    [{'value': 5}, {'value': 8}]

Parameters:

Segment Function: lt

Chatterlang Name: lt

Filters items based on a field value being less than a specified number.

Yields items where the specified field value is less than the given number n.

Args:
    items (iterable): An iterable of items to filter
    field (str): The field/property name to check against
    n (numeric): The number to compare against

Yields:
    item: Items where the specified field value is less than n

Raises:
    AttributeError: If the specified field does not exist on an item (due to fail_on_missing=True)

Example:
    >>> items = [{'val': 1}, {'val': 5}, {'val': 3}]
    >>> list(lt(items, 'val', 4))
    [{'val': 1}, {'val': 3}]

Parameters:

Source Function: randomInts

Chatterlang Name: randomInts

Generate n random integers between lower and upper.

Parameters:

Segment Function: scale

Chatterlang Name: scale

Scale each item in the input stream by the multiplier.

Parameters: