| Summary: | RFE: journald to send logs via network | | |
|---|---|---|---|
| Product: | systemd | Reporter: | Duncan Innes <duncan> |
| Component: | general | Assignee: | systemd-bugs |
| Status: | RESOLVED FIXED | QA Contact: | systemd-bugs |
| Severity: | normal | | |
| Priority: | medium | CC: | jokot3+freedesktop, myroslav |
| Version: | unspecified | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
Description (Duncan Innes, 2014-04-03 15:43:33 UTC)
This feature already exists in systemd 212: http://www.freedesktop.org/software/systemd/man/systemd-journal-remote.html

systemd-journal-remote is just the receiver side. The sending counterpart still hasn't left my local machine. So strictly speaking, this bug shouldn't be closed yet.
Also, there's a suggestion of JSON formatting and some extra tags. I don't think we need/want that, but maybe the reporter can explain the intended use case a bit more.
> The ability to add custom tags to JSON output would assist larger organisations.
So what exactly is the use case here, and why aren't the _MACHINE_ID + _BOOT_ID fields enough?
Zbigniew, you're right. I misread the systemd-journal-remote man page and thought it did support push. At my company we use journal2gelf [1] to push messages. Of course, that pushes in GELF format, which is for Logstash aggregation, not journal aggregation. I'd be concerned about the performance implications of push aggregation to the journal right now.

[1] https://github.com/systemd/journal2gelf

Thanks for keeping this open. I was confused about which part did the sending and which did the receiving.

As for the output formats and extra tags - here goes.

The use case for JSON formatting is to send logs to alternative aggregators (such as Logstash, as mentioned in comment #3). The ability to receive logs in separated-field format rather than log lines makes it much easier for these systems to parse entries and stick them in whatever database is being used.

The use case for extra tags, I would say, is similar to Puppet/Foreman hostgroups or classes. Systems know quite a lot about themselves which the log aggregator is going to have a hard time figuring out:

- Client systems know if they are dev, test, UAT or production.
- Client systems know if they are in the DMZ (potentially).
- Database servers know that they are database servers.
- Web servers know that they are web servers.
- and so on . . .

If each client can add some tags that provide context to the log entries, searches through logs can be made very much more useful. I could search for all IPTABLES denials on my web servers. I could search for all failed login attempts on my DMZ servers. Strictly speaking, the log comes from a single machine, but being able to group these machines arbitrarily (as happens naturally on a large estate) would allow an extremely powerful context search on the log database.

Why not get the aggregator/parser/indexer to add these fields? Those machines will not necessarily know all the details that the client might want to add. The client already knows these details, or can have them set via whatever config management tool is being used. Overall system load will also be reduced by clients having a config entry that (for example) hard-codes "cluster": "WebApp3" to be added to the log entries, rather than having the aggregator perform repeated calculations or lookups against LDAP, a node classifier or some other method (a sketch of this idea follows this comment).

I don't mean to unduly extend the features of log shipping, but allowing a couple of output formats and some extra fields to be pushed would be a big benefit to large-scale system users, especially when the first point of inspection of aggregated logs is potentially a script/automated process rather than a SysAdmin.

Going further, it would be possible to see the use for doing some parsing of log lines on the client: IPTABLES log entries could be parsed to populate fields for IN, OUT, MAC, SRC, DST, PROTO, TTL, DPT, SPT etc. rather than leaving everything on the log message line. I'm struggling to think of other good examples (I've been parsing & searching IPTABLES logs all day and it's now late). Just a thought (perhaps more of a random thought), but I don't think functionality in this direction would go unused either.

Final question: is there failover/load balancing ability on the cards for the remote sending? i.e. setting up 2 log destinations, possibly with round robin or plain failover when 1 destination is out of action? Would journald be capable of remembering the last successfully sent entry in the event of all destinations being offline? Rather than buffering output to disk in the event of network failure, just point to the last sent log entry and restart from there when the destinations become available.

Too much for one bugzilla? Split out into 2 or more?

Duncan
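The "extra tags" idea above is already possible at the point where a client logs: libsystemd's `sd_journal_send()` accepts arbitrary `FIELD=value` pairs alongside `MESSAGE=`. Below is a minimal sketch; the field names `CLUSTER`, `ENVIRONMENT` and `ROLE` are made-up examples, not standard journal fields, and carrying such fields through a network transport and JSON output is exactly what this RFE asks for.

```c
/* Sketch: a client attaches its own context fields to a journal entry.
 * CLUSTER, ENVIRONMENT and ROLE are hypothetical example fields.
 * Build with: gcc tag.c -lsystemd */
#include <systemd/sd-journal.h>

int main(void) {
        /* Each argument is a FIELD=value pair; the list is NULL-terminated.
         * journald stores these alongside the trusted fields it adds itself
         * (_MACHINE_ID, _BOOT_ID, _HOSTNAME, ...). */
        sd_journal_send("MESSAGE=login failed for user example",
                        "PRIORITY=4",
                        "CLUSTER=WebApp3",          /* hypothetical tag */
                        "ENVIRONMENT=production",   /* hypothetical tag */
                        "ROLE=webserver",           /* hypothetical tag */
                        NULL);
        return 0;
}
```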
(In reply to comment #3)
> Zbigniew, you're right. I misread the systemd-journal-remote man page and
> thought it did support push.

The man page could probably use some polishing :) systemd-journal-remote supports pulling, but this support is rather primitive, and is certainly not enough for sustained transfer of logs.

> At my company we use journal2gelf [1] to push messages. Of course, that
> pushes in GELF format, which is for Logstash aggregation, not journal
> aggregation. I'd be concerned about the performance implications of push
> aggregation to the journal right now.

Journald is fairly slow because it does a lot of /proc trawling for each message. When receiving messages over the network, all possible data is already there, so it should be reasonably fast. I expect HTTP and especially TLS to be the bottlenecks, not the journal writing code. Running benchmarks is on my TODO list.

(In reply to comment #4)
> The use case for JSON formatting is to send logs to alternative aggregators
> (such as Logstash as mentioned in comment #3). The ability to receive logs
> in separated format rather than log lines makes it much easier for these
> systems to parse entries and stick them in whatever database is being used.

Adding JSON support to systemd-journal-upload (the sender part, which is currently unmerged) would probably be quite simple... But for this to be useful, it has to support whatever protocol the receiver uses. I had a look at the Logstash docs, and it seems that the json_lines codec should work. I'm not sure about the details, but it looks like something that could be added without too much trouble. Maybe some interested party will write a patch :)

> The use case for extra tags I would say is similar to Puppet/Foreman
> hostgroups or classes. Systems know quite a lot about themselves which the
> log aggregator is going to have a hard time figuring out.

OK. This sounds useful (and easy to implement).

(In reply to comment #6)
> Final question: is there failover/load balancing ability on the cards for
> the remote sending?

So far no.

> i.e. setting up 2 log destinations, possibly with round robin or plain
> failover when 1 destination is out of action?
>
> Would journald be capable of remembering the last successfully sent entry in
> the event of all destinations being offline? Rather than buffering output to
> disk in the event of network failure, just point to the last sent log entry
> and restart from there when the destinations become available.

journald is not directly involved. The uploader is a program totally separate from journald and is simply another journal client. It keeps the cursor of the last successfully sent entry in a file on disk, and when started, by default, uploads all entries after that cursor and then new ones as they come in (see the sketch below).

> Too much for one bugzilla? Split out into 2 or more?

No, it's fine.

Sorry - should have come back to you on your last comment. Coding is not something I am particularly gifted in, so whilst I'm happy to give it a go, the result will probably be of a lower quality than you'd like. I'll have a look at the code though. Any pointers as to where to begin my search?

From the Logstash end, there's a Jira ticket for a journald shipper: https://logstash.jira.com/browse/LOGSTASH-1807 I've commented a few times. Hopefully there can be some cooperation between these tickets to find a good solution. My view is that journald should implement the 'global' solution that can push to whoever is listening. The 3rd party aggregators can then write plugins (if necessary) to listen and pull this data stream into their systems.
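To make the json_lines idea and the on-disk cursor concrete, here is a minimal sketch using libsystemd's sd-journal C API: it walks the local journal, prints one JSON object per entry, and keeps track of the cursor of the last entry it handled. JSON escaping, binary fields and the actual network transport are left out; this only illustrates the flow and is not how systemd-journal-upload is implemented.

```c
/* Sketch of a sender: one JSON object per journal entry (roughly the
 * json_lines shape), remembering the last cursor so a restart could resume.
 * Build with: gcc sender.c -lsystemd */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <systemd/sd-journal.h>

int main(void) {
        sd_journal *j;
        char *cursor = NULL;

        if (sd_journal_open(&j, SD_JOURNAL_LOCAL_ONLY) < 0)
                return 1;

        /* A real sender would load the saved cursor from a state file and
         * call sd_journal_seek_cursor() here to resume after it. */

        while (sd_journal_next(j) > 0) {
                const void *data;
                size_t length;
                int first = 1;

                printf("{");
                SD_JOURNAL_FOREACH_DATA(j, data, length) {
                        /* data is "FIELD=value"; split on the first '=' */
                        const char *eq = memchr(data, '=', length);
                        if (!eq)
                                continue;
                        printf("%s\"%.*s\":\"%.*s\"",
                               first ? "" : ",",
                               (int)(eq - (const char *) data), (const char *) data,
                               (int)(length - (eq - (const char *) data) - 1), eq + 1);
                        first = 0;
                }
                printf("}\n");

                /* Remember how far we got; a real sender persists this. */
                free(cursor);
                if (sd_journal_get_cursor(j, &cursor) < 0)
                        cursor = NULL;
        }

        if (cursor)
                fprintf(stderr, "last cursor: %s\n", cursor);

        free(cursor);
        sd_journal_close(j);
        return 0;
}
```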
Did your code get off your laptop yet? I would be really interested in seeing systemd-journal-upload! Is there anything I can do to help get that completed?

(In reply to comment #8)
> Did your code get off your laptop yet?

I pushed the code to systemd master today (commit http://cgit.freedesktop.org/systemd/systemd/commit/?id=3d090cc6f34e59 and surrounding ones).

(In reply to comment #7)
> Journald is fairly slow because it does a lot of /proc trawling for each
> message. When receiving messages over the network, all possible data is
> already there, so it should be reasonably fast. I expect HTTP and especially
> TLS to be the bottlenecks, not the journal writing code. Running benchmarks
> is on my TODO list.

Well, I was quite wrong here. It turns out that writing to the journal *is* the slow part. I'll probably publish some benchmarks on the mailing list tomorrow, but, essentially, writing to the journal is the most significant part, followed by TLS overhead. If compression is turned on, things are much worse, because XZ compression was very slow.

This patchset was delayed because I worked on adding LZ4 compression to the journal, which in turn caused other people to tweak the XZ settings, improving compression speed greatly without significant loss of compression ratio. So in general, things have improved on all fronts. With LZ4 compression, the compression overhead should be less significant, since the speed is in the 500-1500 MB/s range, depending on the compressibility of the data.
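For anyone who wants to sanity-check the quoted LZ4 throughput range on their own hardware, a rough micro-benchmark follows. It assumes liblz4's `LZ4_compress_default()`; it is only illustrative and is not the benchmark setup used for the journal work.

```c
/* Rough LZ4 throughput check: compress a 1 MiB block repeatedly and report
 * MB/s. Build with: gcc lz4bench.c -llz4 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <lz4.h>

int main(void) {
        const size_t size = 1 << 20;   /* 1 MiB input block */
        const int rounds = 512;
        char *src = malloc(size);
        char *dst = malloc((size_t) LZ4_compressBound((int) size));
        if (!src || !dst)
                return 1;

        /* Mildly compressible input: a repeated short string with some noise. */
        for (size_t i = 0; i < size; i++)
                src[i] = "journal entry "[i % 14] ^ (i & 3);

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < rounds; i++)
                if (LZ4_compress_default(src, dst, (int) size,
                                         LZ4_compressBound((int) size)) <= 0)
                        return 1;
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("LZ4: %.0f MB/s\n", rounds * (size / 1e6) / secs);

        free(src);
        free(dst);
        return 0;
}
```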