syslog_cef_receiver deviation from Go Collector syslog parsing of TAG in RFC 3164 #1729

@drewrelmas

The Collector-Contrib syslogreceiver uses an underlying parser library that produces appname and proc_id attributes for RFC 3164 syslog messages:

Collector-Contrib code link

// parseRFC3164 will parse an RFC3164 syslog message.
func (p *Parser) parseRFC3164(syslogMessage *rfc3164.SyslogMessage, skipPriHeaderValues bool) (map[string]any, error) {
	value := map[string]any{
		"timestamp": syslogMessage.Timestamp,
		"hostname":  syslogMessage.Hostname,
		"appname":   syslogMessage.Appname,
		"proc_id":   syslogMessage.ProcID,
		"msg_id":    syslogMessage.MsgID,
		"message":   syslogMessage.Message,
	}
	...

Internal parser dependency code link

	if sm.tag != "-" && sm.tag != "" {
		out.Appname = &sm.tag
	}
	if sm.content != "-" && sm.content != "" {
		// Content is usually process ID
		// See https://tools.ietf.org/html/rfc3164#section-5.3
		out.ProcID = &sm.content
	}
	if sm.message != "" {
		out.Message = &sm.message
	}

The RFC 3164 spec has this to say:

It has also been considered to be a good practice to include some
information about the process on the device that generated the
message - if that concept exists. This is usually the process name
and process id (often known as the "pid") for robust operating
systems. The process name is commonly displayed in the TAG field.
Quite often, additional information is included at the beginning of
the CONTENT field. The format of "TAG[pid]:" - without the quote
marks - is common. The left square bracket is used to terminate the
TAG field in this case and is then the first character in the CONTENT
field. If the process id is immaterial, it may be left off.

In that case, a colon and a space character usually follow the TAG.
This would be displayed as "TAG: " without the quotes. In that case,
the colon is the first character in the CONTENT field.

When the current Rust implementation gets a message like:

<134>{timestamp} securityhost myapp[1234]: User admin logged in from 10.0.0.1 successfully

it produces:

LogRecord #0:
-> ObservedTimestamp: 1767725764424071771
-> Timestamp: 1767725764000000000
-> SeverityText: INFO
-> SeverityNumber: 9
-> Body: <134>Jan 06 18:56:04 securityhost myapp[1234]: User admin logged in from 10.0.0.1 successfully
-> Attributes:
-> syslog.facility: 16
-> syslog.severity: 6
-> syslog.host_name: securityhost
-> syslog.tag: myapp[1234]
-> syslog.content: User admin logged in from 10.0.0.1 successfully
-> Trace ID:
-> Span ID:
-> Flags: 0

Can/should we optimistically produce app_name (myapp) and proc_id (1234) attributes for RFC 3164 messages as well? One alternative is to leave the receiver unchanged and use downstream processors to split the tag attribute.
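
For discussion, here is a minimal sketch of what the optimistic split could look like. It is not taken from the current receiver code: `split_tag` and `non_empty` are hypothetical helpers, and treating "-" or an empty string as absent is an assumption borrowed from the checks in the Go dependency quoted above. A tag with an unbalanced bracket is kept whole as the app name rather than guessed at.

```rust
/// Hypothetical helper: treat "" and "-" as "no value", mirroring the
/// `!= "-" && != ""` checks in the Go parser dependency.
fn non_empty(s: &str) -> Option<&str> {
    if s.is_empty() || s == "-" {
        None
    } else {
        Some(s)
    }
}

/// Hypothetical helper: split an RFC 3164 TAG such as "myapp[1234]" into
/// (app name, process id). The process id is optional.
fn split_tag(tag: &str) -> (Option<&str>, Option<&str>) {
    // Tolerate a trailing colon in case the caller passes "myapp[1234]:".
    let tag = tag.strip_suffix(':').unwrap_or(tag);

    match tag.split_once('[') {
        Some((name, rest)) => match rest.strip_suffix(']') {
            // Well-formed "name[pid]".
            Some(pid) => (non_empty(name), non_empty(pid)),
            // Unbalanced bracket: do not guess, keep the whole tag as the name.
            None => (non_empty(tag), None),
        },
        // No bracket at all: the tag is just the app name.
        None => (non_empty(tag), None),
    }
}
```

With the example message above, `split_tag("myapp[1234]")` yields `(Some("myapp"), Some("1234"))`, and the existing syslog.tag attribute could still be emitted unchanged alongside the two new attributes.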

Labels: rust (Pull requests that update Rust code)

Status: Done