intelmq.bots.parsers.twitter package
Submodules
intelmq.bots.parsers.twitter.parser module
Parser of text intended to obtain IOCs from tweets. First, substitutions are performed; then each word in the text is compared with ‘(/|^)([a-z0-9.-]+.[a-z0-9]+?)([/:]|$)’. If a word matches, get_tld is used to check whether the match can be a valid domain. A whitelist is also available for filtering out known-good domains.
- param domain_whitelist
domains that will be ignored in parsing
- param substitutions
semicolon-separated list of substitution pairs that will be applied to the text; for example, “ .com,.com” enables parsing of one fuzzy format, and “[.];.” enables parsing of another fuzzy format
- param classification_type
string with a valid classification type
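The processing order described above (substitutions, pattern match, validity check, whitelist) can be sketched roughly as follows. This is an illustrative sketch, not the bot's actual code: `extract_domains`, `looks_like_domain`, and `KNOWN_TLDS` are hypothetical names, the toy TLD set stands in for the get_tld check so the example needs only the standard library, and the dot before the final label of the pattern is escaped on the assumption that it is meant literally.

```python
import re

# Pattern quoted in the module description. The dot before the final label is
# escaped here on the assumption that it is meant as a literal dot.
DOMAIN_PATTERN = re.compile(r"(/|^)([a-z0-9.-]+\.[a-z0-9]+?)([/:]|$)")

# Hypothetical stand-in for the get_tld validity check (not the real library).
KNOWN_TLDS = {"com", "net", "org", "co"}


def looks_like_domain(candidate):
    """Toy validity check: the last label must be one of KNOWN_TLDS."""
    return candidate.rsplit(".", 1)[-1] in KNOWN_TLDS


def extract_domains(text, substitutions, whitelist):
    """Apply substitutions, then collect non-whitelisted domain candidates."""
    # Step 1: undo "fuzzy" obfuscation such as "[.]" before matching.
    for old, new in substitutions:
        text = text.replace(old, new)
    found = []
    # Step 2: compare each whitespace-separated word with the pattern.
    for word in text.lower().split():
        match = DOMAIN_PATTERN.search(word)
        if match is None:
            continue
        domain = match.group(2)
        # Step 3: keep only plausible domains that are not whitelisted.
        if looks_like_domain(domain) and domain not in whitelist:
            found.append(domain)
    return found


tweet = "new phishing at hxxp://evil[.]com/login via https://t.co/abc"
print(extract_domains(tweet, [("[.]", ".")], {"t.co"}))  # ['evil.com']
```

With an empty whitelist the same call would also return `t.co`, which is why the whitelist step comes last.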
- intelmq.bots.parsers.twitter.parser.BOT
alias of intelmq.bots.parsers.twitter.parser.TwitterParserBot
- class intelmq.bots.parsers.twitter.parser.TwitterParserBot(bot_id: str, start: bool = False, sighup_event=None, disable_multithreading: Optional[bool] = None)
Bases: intelmq.lib.bot.ParserBot
Parse tweets and extract IoC data. Currently only URLs are supported. A whitelist of safe domains can be provided.
- classification_type: str = 'blacklist'
- default_scheme: Optional[str] = None
- domain_whitelist: str = 't.co'
- get_data_from_text(text) → list
- get_domain(address)
- in_whitelist(domain: str) → bool
- init()
- process()
- substitutions: str = '.net;[.]net'
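The `substitutions` parameter is a single string, so it has to be split into pairs before use. The following is a minimal sketch of one plausible reading, assuming that consecutive semicolon-separated entries form (search, replacement) pairs; `parse_substitutions` is a hypothetical helper for illustration and not part of the class.

```python
def parse_substitutions(raw):
    """Split 'a;b;c;d' into [('a', 'b'), ('c', 'd')].

    Assumption: entries are taken two at a time as (search, replacement).
    """
    if not raw:
        return []
    parts = raw.split(";")
    if len(parts) % 2 != 0:
        raise ValueError("substitutions must contain an even number of entries")
    # Pair every even-indexed entry with the entry that follows it.
    return list(zip(parts[0::2], parts[1::2]))


print(parse_substitutions(".net;[.]net"))       # [('.net', '[.]net')]
print(parse_substitutions("[.];.; .com;.com"))  # [('[.]', '.'), (' .com', '.com')]
```

Rejecting an odd number of entries early makes a misconfigured string fail loudly instead of silently dropping the last entry.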