So truncate would be better for deleting blank lines, assuming you can be certain of the character count. Agreed that it would be better to eliminate the "duplicate line" workaround on the server side, but there is a challenge there as well: how would you distinguish a response that came back with fewer lines than the record limit from one that was short only because the server couldn't send all the records matching the last timestamp without exceeding the record limit? It's a non-trivial change, and it could also break other scripts or supported tools that depend on the current behavior.
We are actively investigating backend improvements, such as starting to return data as soon as some of it is available from the backend database, to avoid idle timeouts. Again, this is non-trivial because the authentication and API front end is independent of the middleware that handles the backend database queries.
My script was never intended to be a substitute for supported tools, just a stopgap to fill a need while the logging client was in development. The logging client is now released and fully supported, but unfortunately it still has some deficiencies that my script does not, particularly around handling network timeouts gracefully.
More than happy to publish any improvements you can make, but I simply do not have the time or bandwidth to do the necessary experimentation and testing. I did make a few minor changes for one large customer that wanted to treat each completed pull as a separate file rather than merging. That allowed them to save processing time and start post-retrieval processing sooner, but they had already modified the script to meet other needs, so I'm not sure integration into the current public script would be simple. It would certainly need thorough testing: the first attempt at modifying the customer's script without thorough testing resulted in an epic fail. ;-(
Another thought: using truncate could be a nice speed improvement for deleting the "useless lines" (the duplicate-prevention records) if we could easily and efficiently get a count of bytes in all the trailing records that need to be deleted. Any ideas?
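One way this might work, sketched below: `tail -n N | wc -c` gives the byte count of the last N lines (newlines included) without rewriting the file, and `truncate -s -N` then cuts exactly that many bytes off the end in place. The file name and record count here are made up for demonstration; they are not from the script.

```shell
# Sketch: delete the last $dupCount lines of a pull without rewriting the
# whole file. Demo file and counts are illustrative, not from the script.
printf 'rec1\nrec2\nrec3\ndup1\ndup2\n' > pull.log   # simulate a pull
dupCount=2                                           # trailing lines to drop
bytesToCut=$(tail -n "$dupCount" pull.log | wc -c)   # bytes incl. newlines
truncate -s "-$bytesToCut" pull.log                  # cut in place at the end
```

Note that `tail` seeks from the end of the file, so the byte count is cheap even on large pulls; the main portability caveat is that `truncate` is a GNU coreutils tool, so this assumes a Linux host.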
I'm still testing changes in the script to figure out if there are any negative consequences.
I have a question about this line in the 5.3.1 version:
cp $qTmpFileName $qOrigEndCTime.part
I see that this .part file isn't referenced anywhere else in the script.
Interestingly, on one machine I found a couple of 1652xxxxxx.part files that match gaps in the SIEM, so my impression is that these .part files should be processed somehow; but because of completeFile=false, they never get merged into a log later in the script.
Good catch. The copy of the partial file was part of my debugging, to cover the situation where a truncated file was received. I am fairly confident that the line can be commented out, but if you do see files of that form, it indicates there was an instance where records were returned but the file was incomplete relative to what the server was trying to send. If the record limit had been reached, you would see two blank lines at the end. Something must have aborted the file transfer, but curl still wrote what it had received up to that point.
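If it helps with triage, the two-blank-lines marker described above can be checked mechanically: a limit-terminated pull ends in three consecutive newlines (last record line, then two blank lines), so if the last three bytes are all newlines the pull hit the record limit, otherwise the transfer was likely aborted. The demo file name below is invented; point the check at your leftover .part files.

```shell
# Sketch: classify a leftover .part file. Per the behavior described above,
# a pull that hit the record limit ends with two blank lines (three
# consecutive newlines). demo.part is a made-up example, not from the script.
printf 'record A\nrecord B\n\n\n' > demo.part   # simulate a limit-hit pull
if [ -z "$(tail -c 3 demo.part)" ]; then        # last 3 bytes all newlines?
  echo "record limit reached"
else
  echo "transfer likely aborted"
fi
```

This works because command substitution strips trailing newlines, so the test string is empty only when the final three bytes are all newlines.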
Yes, this is a challenge. I have worked with other cloud log API implementations, and one approach is to do more of the processing on the server side and leave the client with an easy job. The server knows all the answers: how many logs there are, whether any events with the last timestamp didn't fit in the current batch, and so on. The server could simply attach this meta information to the archive to make all the work easier.
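Purely as an illustration of what such server-side metadata might look like (none of these fields exist in the current API; the names are invented), a small JSON document attached to each archive could let the client decide deterministically whether to re-query:

```json
{
  "recordCount": 50000,
  "recordLimit": 50000,
  "lastTimestamp": 1652000000,
  "moreRecordsAtLastTimestamp": true
}
```

With something like `moreRecordsAtLastTimestamp`, the client would no longer need the duplicate-line workaround to distinguish "short because done" from "short because the limit cut mid-timestamp".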
Anyway, many thanks for all your work and the many, many hours you have put into the script!
UPDATE! Name change for pulling logs.
Please update your conf files to reflect the following changes:
For the logging script configuration (the KB says this change is mandatory, but my understanding is that msg.mcafeesaas.com will continue to resolve after December 1; still, it would be best to change sooner rather than later), see Web Gateway Cloud Service IP addresses and ranges (KB87232):
So, in the script configuration file, change msg.mcafeesaas.com to us.logapi.skyhigh.cloud.
If you are pulling logs from other regions, change xx.msg.mcafeesaas.com to xx.logapi.skyhigh.cloud.
Also, don't forget to change your .netrc file to match the new host names. .netrc reference
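If you have several hosts to update, the renames above can be scripted. This is only a sketch: "example.conf" stands in for your script's actual conf file (its real name depends on your setup), and you would run the same two sed commands against your .netrc as well. A .bak backup is kept by `-i.bak`.

```shell
# Sketch: rewrite the old host names in place. example.conf is a stand-in
# for your real conf file; also run these against ~/.netrc.
printf 'host=msg.mcafeesaas.com\nhost2=de.msg.mcafeesaas.com\n' > example.conf
# Regional hosts first: xx.msg.mcafeesaas.com -> xx.logapi.skyhigh.cloud
sed -i.bak 's/\([a-z][a-z]\)\.msg\.mcafeesaas\.com/\1.logapi.skyhigh.cloud/g' example.conf
# Remaining bare host: msg.mcafeesaas.com -> us.logapi.skyhigh.cloud
sed -i.bak 's/msg\.mcafeesaas\.com/us.logapi.skyhigh.cloud/g' example.conf
```

Doing the regional (two-letter-prefix) substitution first matters: after it runs, any remaining msg.mcafeesaas.com occurrences are the bare US host, so the second pass can safely add the us. prefix.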
Make sure the user has the data analytics role in SSE (auth.ui.trellix.com).
@jebeling Thanks for this update. It's helpful for those of us not using the bash script too. Can you please include the KB number in the response so we have it for future reference and can send it to support teams who need more information?