About 1/3 of agents report "error (timeout)" on Detect schedules since moving server and agents to 10.1

Since moving our K1000 server and agents to 10.1, about a third of our clients fail their usual Detect schedules with "error (timeout)". It isn't exactly the same set of clients each time, but about 95% of them are the same. I've seen a few of the "bad" clients fail, succeed on the next run, and then fail again. I'm not great at reading the KAgent logs, but it looks like when a Detect starts, the client begins detecting but never finishes.
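
One way to put a number on that overlap is sketched below. This is only a minimal illustration, assuming the names of the timed-out devices from each Detect run have been copied by hand into Python sets; the hostnames are hypothetical and nothing here calls a KACE API.

```python
from collections import Counter


def classify_failures(runs):
    """Split hosts into those that failed every run and those that failed only some.

    runs: list of sets, each holding the hostnames that timed out in one Detect run.
    """
    counts = Counter(host for failed in runs for host in failed)
    total = len(runs)
    always = {host for host, n in counts.items() if n == total}
    intermittent = {host for host, n in counts.items() if n < total}
    return always, intermittent


def overlap(a, b):
    """Fraction of the hosts that failed in run a that also failed in run b."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical example data: two Detect runs and the clients that timed out in each.
runs = [
    {"pc-01", "pc-02", "pc-03"},
    {"pc-01", "pc-02", "pc-04"},
]
always, intermittent = classify_failures(runs)
print("failed every run:", sorted(always))
print("failed some runs:", sorted(intermittent))
print("overlap run 1 -> run 2:", overlap(runs[0], runs[1]))
```

Hosts that land in the "failed every run" group are the ones worth collecting agent logs from first.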

I've tried these steps:

  • Uninstalling AV
  • Forcing a server retrust via KAT
  • Uninstalling and reinstalling the 10.1 agent
  • Manually deleting the Patches folder
  • Running a Detect on just 10 of the bad clients, with the timeout set to 8 hours

All of them still fail, usually with "timeout".

I talked with Support, who eyeballed our K1000 settings. The last suggestion I heard was "upgrade Win10 version and retry" (most clients, including the hundreds that are successful, are on 1803 and will soon move to 1909). However, this had no effect (one of the bad clients is my own PC, which has been on 1903 for some time).

The one thing I can think of that this group of "bad" clients has in common is that all of them have been in place for several years, and have lived through several Kace server and agent updates. However, there are many clients that fit this description that Detect just fine. 

I really am at a loss and running out of things to try while I wait for Support to get back to me. 

Comments (2)
  • Are you seeing any communication or dependency download related errors in your Kace agent logs? - Kiyolaka 4 years ago
    • Looking at the agent logs, we do not see any errors so far. It really does look like the agents are attempting the Detects, but for some reason some clients get done pretty quickly and others take the full 6 or even 8 hours and then time out.

      One thing Support had me try today, which we're now retesting, is altering the SmartLabels for the patches being detected, since I hadn't limited them to Active patches only. So far it doesn't appear to be making a difference: the "good" clients finish in an hour and the "bad" ones are still going, but maybe they will succeed before the 6 hours run out. Will update later today or tomorrow. Thank you! - agibbons 4 years ago
  • How many patches are in the label?
    Are the devices local to the SMA or in a remote location? If remote, are you using replication? - KevinG 4 years ago
    • There were way too many patches in the label! After working with Support to edit the SmartLabel, it was down to around 1000 patches. I'm working on limiting the SmartLabel further. I still don't quite understand why some PCs detected against this number of patches quickly while others took a long time. All devices are local to the SMA. - agibbons 4 years ago

Answers (2)

Posted by: KevinG 4 years ago
Red Belt

Please use the Kace Agent Toolkit to collect the logs from a few of the devices that are having the issue, and attach them to your support ticket.

This will help support with the troubleshooting process. Thanks

Using the KACE Agent Toolkit (KAT) (263376)

Posted by: agibbons 4 years ago
Purple Belt

Top Answer

I think the answer to this question was that the SmartLabel I was detecting against had too many patches. After some editing of the label, the Detect success rate went way up. I still don't quite understand why some clients take far longer to detect than others (similar hardware, same location, etc.), but at least it's much improved.
