Selkie

Corpus processing

  • selkie.rom — Romanizations
  • selkie.xml — XML files
  • selkie.tok — Tokenizer for Latin scripts
  • selkie.pretts — Converting text to bare words
  • Conc - move back conc.rst.safe
  • selkie.corpus — Simple text corpus
    • Drill

© Copyright 2022, Steven Abney.
