Harmonization field names

Harmonization types

ASN

ASN type. Derived from Integer with forbidden values.

Valid values satisfy 0 < asn <= 4294967295.

See https://en.wikipedia.org/wiki/Autonomous_system_(Internet):

"The first and last ASNs of the original 16-bit integers, namely 0 and 65,535, and the last ASN of the 32-bit numbers, namely 4,294,967,295 are reserved and should not be used by operators."
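
The range rule can be illustrated with a small check. This is only a sketch of the documented condition; the names `ASN_MAX` and `is_valid_asn` are hypothetical and not part of the library.

```python
ASN_MAX = 4294967295  # largest 32-bit ASN, 2**32 - 1


def is_valid_asn(value) -> bool:
    """Return True if value is an integer inside the documented ASN range."""
    # bool is a subclass of int in Python, so it is rejected explicitly,
    # mirroring the rule stated for the Integer type.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 < value <= ASN_MAX
```

For example, `is_valid_asn(64496)` is accepted, while `0` and `4294967296` fall outside the range.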

Accuracy

Accuracy type. A Float between 0 and 100.

Base64

Base64 type. Always gives unicode strings.

Sanitation encodes to base64 and accepts binary and unicode strings.
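
A minimal sketch of this sanitation step, using only the standard library. The function name `sanitize_base64` is hypothetical, and encoding unicode input as UTF-8 first is an assumption.

```python
import base64


def sanitize_base64(value) -> str:
    """Encode a binary or unicode string as base64 and return a unicode string."""
    if isinstance(value, str):
        # Assumption: unicode input is encoded as UTF-8 before base64 encoding.
        value = value.encode("utf-8")
    return base64.b64encode(value).decode("ascii")
```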

Boolean

Boolean type. Without sanitation only Python bool is accepted.

Sanitation accepts string ‘true’ and ‘false’ and integers 0 and 1.
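
The accepted inputs could be handled roughly as follows. `sanitize_boolean` is a hypothetical helper; ignoring letter case and surrounding whitespace is an assumption, not something stated above.

```python
def sanitize_boolean(value):
    """Return a bool for accepted inputs, or None if the value is not accepted."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        # Assumption: surrounding whitespace and letter case are ignored.
        return {"true": True, "false": False}.get(value.strip().lower())
    if isinstance(value, int) and value in (0, 1):
        return bool(value)
    return None
```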

ClassificationTaxonomy

classification.taxonomy type.

The mapping follows Reference Security Incident Taxonomy Working Group – RSIT WG https://github.com/enisaeu/Reference-Security-Incident-Taxonomy-Task-Force/

These old values are automatically mapped to the new ones:

  • ‘abusive content’ -> ‘abusive-content’

  • ‘information gathering’ -> ‘information-gathering’

  • ‘intrusion attempts’ -> ‘intrusion-attempts’

  • ‘malicious code’ -> ‘malicious-code’

Allowed values are:
  • abusive-content

  • availability

  • fraud

  • information-content-security

  • information-gathering

  • intrusion-attempts

  • intrusions

  • malicious-code

  • other

  • test

  • vulnerable

ClassificationType

classification.type type.

The mapping follows Reference Security Incident Taxonomy Working Group – RSIT WG https://github.com/enisaeu/Reference-Security-Incident-Taxonomy-Task-Force/ with extensions.

These old values are automatically mapped to the new ones:

  • ‘botnet drone’ -> ‘infected-system’

  • ‘ids alert’ -> ‘ids-alert’

  • ‘c&c’ -> ‘c2-server’

  • ‘c2server’ -> ‘c2-server’

  • ‘infected system’ -> ‘infected-system’

  • ‘malware configuration’ -> ‘malware-configuration’

  • ‘Unauthorised-information-access’ -> ‘unauthorised-information-access’

  • ‘leak’ -> ‘data-leak’

  • ‘vulnerable client’ -> ‘vulnerable-system’

  • ‘vulnerable service’ -> ‘vulnerable-system’

  • ‘ransomware’ -> ‘infected-system’

  • ‘unknown’ -> ‘undetermined’

These old values cannot be mapped automatically because they are ambiguous:

‘malware’: Either ‘infected-system’ or ‘malware-distribution’
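
The difference between mappable and ambiguous old values can be sketched with a lookup table. The names `OLD_TYPE_MAPPING`, `AMBIGUOUS_TYPES` and `map_old_type` are hypothetical, only a subset of the mappings is shown, and raising `ValueError` for ambiguous values is an assumption about how a caller might want to be told.

```python
# Subset of the documented old-to-new mappings, for illustration only.
OLD_TYPE_MAPPING = {
    "botnet drone": "infected-system",
    "ids alert": "ids-alert",
    "c&c": "c2-server",
    "c2server": "c2-server",
    "leak": "data-leak",
    "unknown": "undetermined",
}
# Old values that have more than one possible replacement.
AMBIGUOUS_TYPES = {"malware": ("infected-system", "malware-distribution")}


def map_old_type(value: str) -> str:
    """Map a legacy classification.type value to its replacement."""
    if value in AMBIGUOUS_TYPES:
        choices = " or ".join(AMBIGUOUS_TYPES[value])
        raise ValueError(f"{value!r} is ambiguous: use {choices}")
    # Values without a mapping are returned unchanged.
    return OLD_TYPE_MAPPING.get(value, value)
```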

Allowed values are:
  • application-compromise

  • backdoor

  • blacklist

  • brute-force

  • burglary

  • c2-server

  • compromised

  • copyright

  • data-loss

  • ddos

  • ddos-amplifier

  • defacement

  • dga domain

  • dos

  • dropzone

  • exploit

  • harmful-speech

  • ids-alert

  • infected-system

  • information-disclosure

  • data-leak

  • malware-configuration

  • malware-distribution

  • masquerade

  • misconfiguration

  • other

  • outage

  • phishing

  • potentially-unwanted-accessible

  • privileged-account-compromise

  • proxy

  • sabotage

  • scanner

  • sniffing

  • social-engineering

  • spam

  • test

  • tor

  • unauthorised-information-access

  • unauthorised-information-modification

  • unauthorized-command

  • unauthorized-login

  • unauthorized-use-of-resources

  • unprivileged-account-compromise

  • violence

  • vulnerable-system

  • weak-crypto

  • undetermined

DateTime

Date and time type for timestamps.

Valid values are timestamps with time zone in the format ‘%Y-%m-%dT%H:%M:%S+00:00’. Values without a time or without timezone information (UTC) are invalid. Microseconds are also allowed.

Sanitation normalizes the timezone to UTC, which is the only allowed timezone.

The following additional conversions are available with the convert function:

  • timestamp

  • windows_nt: From Windows NT / AD / LDAP

  • epoch_millis: From Milliseconds since Epoch

  • from_format: From a given format, e.g. ‘from_format|%H %M %S %m %d %Y %Z’

  • from_format_midnight: Date from a given format and assume midnight, e.g. ‘from_format_midnight|%d-%m-%Y’

  • utc_isoformat: Parse date generated by datetime.isoformat()

  • fuzzy (or None): Use dateutil’s fuzzy parser, default if no specific parser is given
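
Three of the conversions listed above can be approximated with the standard library alone. The function names below are hypothetical stand-ins and do not show how the convert function itself is called; the sketch assumes that a Windows NT timestamp counts 100-nanosecond intervals since 1601-01-01 UTC.

```python
from datetime import datetime, timedelta, timezone


def from_epoch_millis(millis: int) -> str:
    """Convert milliseconds since the Unix epoch to a UTC ISO 8601 string."""
    moment = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=millis)
    return moment.isoformat()


def from_windows_nt(ticks: int) -> str:
    """Convert a Windows NT timestamp (100 ns intervals since 1601-01-01 UTC)."""
    # Ten 100 ns intervals make one microsecond; finer precision is dropped.
    moment = datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)
    return moment.isoformat()


def from_format_midnight(value: str, fmt: str) -> str:
    """Parse a date with the given format and assume midnight UTC."""
    date = datetime.strptime(value, fmt).date()
    return datetime(date.year, date.month, date.day, tzinfo=timezone.utc).isoformat()
```

All three return strings in the ‘%Y-%m-%dT%H:%M:%S+00:00’ shape, with microseconds appended when they are non-zero.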

FQDN

Fully qualified domain name type.

All valid lowercase domain names are accepted; IP addresses and URLs are not. A trailing dot is not allowed.

To prevent values like ‘10.0.0.1:8080’ (#1235), we check for the non-existence of ‘:’.
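
The restrictions above can be sketched as a rough filter. `looks_like_fqdn` is a hypothetical helper that only applies the rules named in this section; it does not verify that each label is a syntactically valid domain label.

```python
import ipaddress


def looks_like_fqdn(value: str) -> bool:
    """Apply the documented FQDN restrictions (a rough filter, not a full validator)."""
    if not value or value != value.lower():
        return False  # only lowercase domains are accepted
    if value.endswith(".") or ":" in value or "/" in value:
        return False  # no trailing dot, no port or IPv6 literal, no URL
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return True  # not an IP address, so it passes this sketch
    return False  # IP addresses are not domain names
```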

Float

Float type. Without sanitation only Python float/integer/long is accepted. Boolean is explicitly denied.

Sanitation accepts strings and everything float() accepts.

IPAddress

Type for IP addresses, all families. Uses the ipaddress module.

Sanitation accepts integers, strings and objects of ipaddress.IPv4Address and ipaddress.IPv6Address.

Valid values are only strings. 0.0.0.0 is explicitly not allowed.
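
Since the type builds on the ipaddress module, the sanitation can be sketched with `ipaddress.ip_address`, which accepts integers, strings and address objects. `sanitize_ip` is a hypothetical helper, and returning None for rejected input is an assumption.

```python
import ipaddress


def sanitize_ip(value):
    """Return the address as a string, or None if it is not acceptable."""
    try:
        # ip_address() handles integers, strings and
        # IPv4Address/IPv6Address objects alike.
        address = ipaddress.ip_address(value)
    except ValueError:
        return None
    if address == ipaddress.ip_address("0.0.0.0"):
        return None  # explicitly not allowed
    return str(address)
```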

IPNetwork

Type for IP networks, all families. Uses the ipaddress module.

Sanitation accepts strings and objects of ipaddress.IPv4Network and ipaddress.IPv6Network. If host bits in strings are set, they will be ignored (e.g. 127.0.0.1/32).

Valid values are only strings.
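
Ignoring host bits corresponds to the `strict=False` option of `ipaddress.ip_network`. The sketch below shows that behaviour; `sanitize_network` is a hypothetical helper, and returning None for invalid input is an assumption.

```python
import ipaddress


def sanitize_network(value):
    """Return the network as a string with host bits cleared, or None if invalid."""
    try:
        # strict=False masks out host bits instead of raising ValueError.
        return str(ipaddress.ip_network(value, strict=False))
    except ValueError:
        return None
```

With this, ‘192.0.2.1/24’ is normalized to ‘192.0.2.0/24’.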

Integer

Integer type. Without sanitation only Python integer/long is accepted. Bool is explicitly denied.

Sanitation accepts strings and everything int() accepts.
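
The split between validation and sanitation might look like this. Both function names are hypothetical, and returning None when `int()` fails is an assumption.

```python
def is_valid_integer(value) -> bool:
    """Validation without sanitation: plain integers only, bool is denied."""
    # bool is a subclass of int, so it has to be excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def sanitize_integer(value):
    """Sanitation: try int() and return None if the conversion fails."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None
```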

JSON

JSON type.

Sanitation accepts any valid JSON objects.

Valid values are only unicode strings with JSON objects.

JSONDict

JSONDict type.

Sanitation accepts Python dictionaries and JSON strings.

Valid values are only unicode strings with JSON dictionaries.
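
A sketch of this sanitation with the standard json module. `sanitize_json_dict` is a hypothetical helper; returning None for rejected input and the exact output formatting of `json.dumps` are assumptions.

```python
import json


def sanitize_json_dict(value):
    """Return a JSON string holding a dictionary, or None if that is not possible."""
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            return None
    if not isinstance(value, dict):
        return None  # lists, numbers and other JSON values are not dictionaries
    try:
        return json.dumps(value)
    except (TypeError, ValueError):
        return None  # the dictionary holds something JSON cannot represent
```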

LowercaseString

Like String, but only allows lower case characters.

Sanitation converts all characters to lower case.

Registry

Registry type. Derived from UppercaseString.

Only valid values: AFRINIC, APNIC, ARIN, LACNIC, RIPE. RIPE-NCC and RIPENCC are normalized to RIPE.
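
The normalization can be sketched as follows. `REGISTRIES` and `sanitize_registry` are hypothetical names; upper-casing the input follows from the UppercaseString base type, while stripping whitespace is an assumption.

```python
REGISTRIES = {"AFRINIC", "APNIC", "ARIN", "LACNIC", "RIPE"}


def sanitize_registry(value: str):
    """Return the normalized registry name, or None if it is not a valid registry."""
    value = value.strip().upper()
    if value in ("RIPE-NCC", "RIPENCC"):
        value = "RIPE"  # both spellings are normalized to RIPE
    return value if value in REGISTRIES else None
```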

String

Any non-empty string without leading or trailing whitespace.

TLP

TLP level type. Derived from UppercaseString.

Only valid values: WHITE, GREEN, AMBER, RED.

Sanitation accepts any letter case and the prefix ‘tlp:’.
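
For illustration, sanitation of TLP values could look like this. `TLP_LEVELS` and `sanitize_tlp` are hypothetical names, and stripping whitespace around the value is an assumption.

```python
TLP_LEVELS = ("WHITE", "GREEN", "AMBER", "RED")


def sanitize_tlp(value: str):
    """Return the upper-case TLP level, or None if the value is not a TLP level."""
    value = value.strip().upper()
    if value.startswith("TLP:"):
        # Drop the optional prefix, e.g. 'tlp:amber' becomes 'AMBER'.
        value = value[len("TLP:"):].strip()
    return value if value in TLP_LEVELS else None
```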

URL

URI type. Local and remote.

Sanitation converts hxxp and hxxps to http and https. For local URIs (file) a missing host is replaced by localhost.

Valid values must have the host (network location part).
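
The sanitation and validity rules can be approximated with `urllib.parse.urlsplit`. `sanitize_url` is a hypothetical helper that covers only the cases named in this section; returning None for a URL without a host is an assumption.

```python
from urllib.parse import urlsplit


def sanitize_url(value: str):
    """Return a cleaned URL string, or None if it has no host."""
    value = value.strip()
    if value.startswith("hxxps://"):
        # Undo the "defanged" spelling of https.
        value = "https://" + value[len("hxxps://"):]
    elif value.startswith("hxxp://"):
        value = "http://" + value[len("hxxp://"):]
    elif value.startswith("file:///"):
        # Local URI without a host: fill in localhost.
        value = "file://localhost/" + value[len("file:///"):]
    # A valid value must have a network location part.
    return value if urlsplit(value).netloc else None
```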

UppercaseString

Like String, but only allows upper case characters.

Sanitation converts all characters to upper case.