Conversation
43b0941 to
9fcc372
Compare
9fcc372 to
bccd92c
Compare
| DrainData DrainData `json:"type,omitempty"` | ||
| OmitMetadata bool | ||
| InternalTls bool | ||
| LogFilter *LogTypeSet |
There was a problem hiding this comment.
I don't think that this has to be a pointer.
There was a problem hiding this comment.
I tried that at first, but in manager.go line 45 Binding is used as a map key:
sourceDrainMap map[string]map[syslog.Binding]drainHolder
My understanding is that maps in Go are not comparable, so adding LogTypeSet (which is a map[LogType]struct{}) to the Binding struct makes it non-comparable.
AFAICS the alternative to using a pointer would be to implement a custom comparable key for the map, but using a pointer seemed simpler to me.
| default: | ||
| // Unknown log type, skip it | ||
| // TODO add unit test for unknown log type | ||
| d.log.Printf("Unknown log type '%s' in log type filter, ignoring", logType) |
There was a problem hiding this comment.
If there is a log message here and an app owner wrongly configures the drain, than for every app instance and every app log, a log ,message will be written.
The filter settings should be validated upfront and we should not end up in this situation. At this point we should only have valid drain configuration.
I guess we have to bring in #633 first and than this change.
There was a problem hiding this comment.
My understanding is that this code does not run for every log message, but only parsing the syslog drain config. I intentionally left logging out of filtering_drain_writer.go to avoid creating a log for each app log.
There was a problem hiding this comment.
After discussing with @chombium we concluded that yhis log gets created for each wrongly configured value in the log types of each syslog drain. This is emitted every few seconds on each VM within cloud foundry. While this is better than for each log line, we would still keep log load at a minimum.
To address this I now made sure to log at most one log line in this method.
| set := make(syslog.LogTypeSet, len(logTypes)) | ||
|
|
||
| for _, logType := range logTypes { | ||
| logType = strings.ToLower(strings.TrimSpace(logType)) |
There was a problem hiding this comment.
The query string in the URL is case sensitive. We should only accept lowercase and not convert eVerY POssibLE string to lower case.
There was a problem hiding this comment.
Did so.
Still, I wonder if we should be strict here about this. Not caring about casing might bring us some benefits as this might be something that users accidentally misconfigure. It should not hurt much to support both I think.
|
@chombium I left a bunch of replies for points that are not yet clear to me. All the rest look good. Thanks for the thorough review! You found a bunch of nice things. |
jorbaum
left a comment
There was a problem hiding this comment.
Worked in some comments.
| set := make(syslog.LogTypeSet, len(logTypes)) | ||
|
|
||
| for _, logType := range logTypes { | ||
| logType = strings.ToLower(strings.TrimSpace(logType)) |
There was a problem hiding this comment.
Did so.
Still, I wonder if we should be strict here about this. Not caring about casing might bring us some benefits as this might be something that users accidentally misconfigure. It should not hurt much to support both I think.
jorbaum
left a comment
There was a problem hiding this comment.
Worked in most comments. Still 2 left to clarify.
| default: | ||
| // Unknown log type, skip it | ||
| // TODO add unit test for unknown log type | ||
| d.log.Printf("Unknown log type '%s' in log type filter, ignoring", logType) |
There was a problem hiding this comment.
After discussing with @chombium we concluded that yhis log gets created for each wrongly configured value in the log types of each syslog drain. This is emitted every few seconds on each VM within cloud foundry. While this is better than for each log line, we would still keep log load at a minimum.
To address this I now made sure to log at most one log line in this method.
|
Feel free to take another look at it. |
I noticed that as my changes depend on `RawQuery` not being empty, but actually they were.
382854c to
21d9380
Compare
21d9380 to
108740d
Compare
Also rename remaining instances of log type to source type
108740d to
ef16bed
Compare
| return true | ||
| } | ||
|
|
||
| if sourceTypeTag == "" { |
There was a problem hiding this comment.
The source_type is always set. Why is this check needed?
| type SourceType string | ||
|
|
||
| const ( | ||
| SOURCE_API SourceType = "API" |
There was a problem hiding this comment.
Add log somewhere in the name of the constants, so it is clear that these are for logs. Something like LOG_SOURCE_STG for example
| Expect(err).To(HaveOccurred()) | ||
| }) | ||
|
|
||
| Context("when source_type tag is missing", func() { |
There was a problem hiding this comment.
Can a source type be missing for log envelopes?
Please read the Loggregator API spec for Log Envelopes.
| } | ||
| }) | ||
|
|
||
| It("omits logs when source type include filter is configured with LOGS", func() { |
There was a problem hiding this comment.
This sounds odd: It omits logs when the include filter is configured.
Does the test test if the wanted logs are sent (included)? If an include filter is set , only these logs should be included
| Expect(fakeWriter.received).To(Equal(0)) | ||
| }) | ||
|
|
||
| It("sends logs when source type exclude filter is configured with LOGS", func() { |
There was a problem hiding this comment.
The same as above. The naming of the test is odd.
| } | ||
|
|
||
| // AllSourceTypes returns all valid source types | ||
| func AllSourceTypes() []SourceType { |
There was a problem hiding this comment.
Where is this function used?
| set := make(syslog.SourceTypeSet, len(sourceTypes)) | ||
|
|
||
| for _, sourceType := range sourceTypes { | ||
| sourceType = strings.TrimSpace(sourceType) |
There was a problem hiding this comment.
Do not trim, leading and trailing space will be url encoded as %20 or +. Nevertheless, if there is a space in the url that will be misformed url, an invalid Syslog Drain in our case.
| for _, sourceType := range sourceTypes { | ||
| sourceType = strings.TrimSpace(sourceType) | ||
| t, _ := syslog.ParseSourceType(sourceType) | ||
| set.Add(t) |
There was a problem hiding this comment.
What if a source type is invalid? Shouldn't that throw an error and mark the drain as invalid?
| } | ||
|
|
||
| func (d *DrainParamParser) getLogFilter(u *url.URL) *syslog.LogFilter { | ||
| includeSourceTypes := u.Query().Get("include-source-types") |
There was a problem hiding this comment.
IMO it would be better if we add log in the names of the parameters, so that they have clear descriptive names. Maybe something like include-log-source-types and exclude-log-source-types will be better...
| }, | ||
| { | ||
| name: "include-source-types=app", | ||
| url: "https://test.org/drain?include-source-types=app", |
There was a problem hiding this comment.
it should be types=APP. The urls are case sensitive. As the values of the source types are defined uppercase, I'm not sure if lower case should be accepted for the values.
Description
Addresses #711 :
?include-source-types=and?exclude-source-types(calling it source type instead of log type as log type is too generic and is already used in this code base)I still need to verify this, but I think a feature flag is not necessary as I tried to make string parsing more efficient by:
strings.IndexBytemight take some time, but it is used only with a single char/andstrings.IndexByte()should be more efficient thanstrings.Index()logTypePrefixesrather than in each log line, which should be more efficientRelated PRs:
Disclaimer: Claude helped me in crafting this code.
Type of change
Testing performed?
Checklist:
mainbranch, or relevant version branch