Notification of Server/Network outage

vitalsupport · November 13, 2017, 5:49pm

Comodo,

Please add a built-in feature (not a set of scripts or manually created monitors) that can email and possibly call with an automated IVR message when a server’s RMM agent is no longer communicating with the RMM platform. Furthermore, have it check whether the other hosts at the same site are also not reporting and, if so, indicate that the Internet connection or the network is down.

This would be very helpful for very small MSPs to get on top of outages before the client calls about them. I miss that from Continuum. At the time I was using it, they could only alert on the server being unreachable. I’d have to log in to the portal to check whether all the other internal hosts were down to figure out if it was likely an internet outage or a real server outage.

Thanks for listening,
-felipe

Anna_C · November 14, 2017, 12:07am

Hello @vitalsupport,

We have a ‘FEATURE REQUEST: Show Maintenance Notification in login Page’ item that is currently on the roadmap for implementation in 2018-Q2. With it, MSPs/customers will see a notification on the login page showing when maintenance will take place and how long it will last.
We have added you to the loop on this item to keep you posted.

Thank you for your patience.

nct · November 14, 2017, 1:03am

Hi @Anna_C, @vitalsupport is referring to outages of the servers he is managing, not to an outage/fault warning for C1 itself being displayed at login to C1.

Anna_C · November 14, 2017, 2:14am

Hello @nct,

Thank you for clarifying @vitalsupport’s request.
@vitalsupport: We have sent you an email with regard to your request.

Thank you.

vitalsupport · November 14, 2017, 3:50am

@nct Thanks for catching that misunderstanding and correcting it!

emrahsamdan · November 14, 2017, 7:32am

Hello @vitalsupport ,

First of all, I am very pleased to see ideas coming from your side. We always want to hear our customers’ requests about our product.

In this case, I don’t understand why the “Device Status” monitor does not meet your needs. As you may know, you can already use that monitor to track the online/offline status of a device. It can apply to a group of computers or to a single computer, depending on where the profile is assigned (see attachment).

We are able to detect the OS version of a computer, but we do not specifically designate an endpoint as the server for a group of endpoints. To meet your needs, we would therefore first need to add a way to mark an endpoint as the server for other endpoints. Is that right?

Thanks again for the feedback.

Best regards,
Emrah
Product Manager for IT&SM Monitoring, Procedures and Patch Management

eztech · November 14, 2017, 5:38pm

I think I understand his question. Let’s say a client’s entire network goes down. Will an alert still be sent saying that the devices are down, or that the client network as a whole is down? From my testing, that does not happen. Alerts only come through after the internet/network comes back up, because it’s relying on the agent to send the notification, rather than the C1 portal checking for communication and raising the alert itself when it can’t communicate with the client. At least that is how it’s worked for me.

Ilker · November 15, 2017, 3:17am

Hi Felipe,

Good input. What use cases do you see for a group alert other than a site outage, if any?

Hi @eztech

We have agent-side and server-side monitors. The online/offline monitor should be server-side: the server should trigger the alert if it can’t get a heartbeat from the device within the expected interval. It would be great if you could submit a ticket to get that investigated.
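To illustrate the server-side behaviour described above, here is a minimal sketch in Python. It is not the actual C1/ITSM implementation; the function name `find_offline_devices`, the `missed_allowed` parameter, and the shape of the heartbeat table are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def find_offline_devices(
    last_heartbeat: Dict[str, datetime],
    now: datetime,
    expected_interval: timedelta,
    missed_allowed: int = 2,
) -> List[str]:
    """Return the devices whose last heartbeat is too old.

    The check runs on the server, so it still fires when the client
    site has no connectivity and the agent cannot report anything.
    `missed_allowed` (hypothetical) is how many intervals may pass
    without a heartbeat before a device counts as offline.
    """
    cutoff = now - expected_interval * missed_allowed
    return sorted(name for name, seen in last_heartbeat.items() if seen < cutoff)
```

The server would run a check like this on a timer and raise an alert for every name returned, without waiting for the agent to reconnect.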

Best regards,
Ilker

eztech · November 15, 2017, 4:59pm

OK, let me test again. If that is how it works, then I must have something misconfigured. Thanks.

mschubeck · November 15, 2017, 9:32pm

No offense to @vitalsupport, but I’ve worked with Kaseya, Packet Trap, Labtech, Naverisk, Continuum, etc., and none of them would tell you if the entire site was down. We got around that by monitoring whether the edge router/firewall was up or not.

vitalsupport · November 16, 2017, 4:16am

@mschubeck Welcome to the conversation, and no offense was taken. I realize that the “Device Status” monitor would at least provide a basic alert for a down server if enabled. You are right that Continuum did not report that the whole site was down. I did state that I had to log in to the Continuum portal and look at the status to figure that out. Your method of checking the router/firewall would also work if it was being checked by an ITSM agent on the LAN, or if it had ICMP ping or a remote management port open so its status could be checked externally. My approach was to look at all of the hosts at the site and determine that it’s not just an unreachable server but that internet service was likely down, because nothing at the site was reachable.

It would require other logic, such as making sure that everything assigned to the site (group) last reported from the same IP address. You’d want to skip counting roaming users whose last reported IP was different from the site’s fixed IP.
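A rough Python sketch of that filtering step is below. The device records (dicts with `name`, `last_ip`, and `online` keys) and the function names `hosts_at_site` and `site_unreachable` are hypothetical and only meant to show the idea, not how ITSM stores or exposes this data.

```python
from typing import Dict, List


def hosts_at_site(devices: List[Dict], site_ip: str) -> List[Dict]:
    """Keep only devices whose last reported IP is the site's fixed IP.

    Roaming users last reported from some other address, so they are
    skipped and do not count for or against a site outage.
    """
    return [d for d in devices if d["last_ip"] == site_ip]


def site_unreachable(devices: List[Dict], site_ip: str) -> bool:
    """True if at least one host is at the site and none of them is online."""
    on_site = hosts_at_site(devices, site_ip)
    return bool(on_site) and all(not d["online"] for d in on_site)
```

With this, a laptop that last checked in from a hotel would not stop the site from being flagged as unreachable, and an empty group would never be flagged at all.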

-felipe

mschubeck · November 16, 2017, 4:07pm

@vitalsupport I do agree that it would be nice to have that additional logic built into ITSM so it could tell whether an entire site is down or not. Assuming it would work correctly, one of the requirements should be to include the gateway device in the group. It would need to run like this:

Monitor: server is offline for 3 minutes (for example).
Check if all other servers included in group are also offline.
If no, create individual alerts for any servers that are offline. If yes, check if gateway is offline.
If gateway is online, create individual alerts. If gateway is offline, create a single “site down” alert.

The time offline before the process begins should be set by the admin.
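The steps above could be sketched roughly as follows in Python. This is only an illustration of the decision flow, not an existing ITSM feature; `evaluate_site`, its parameters, and the alert strings are made up for the example, and the offline durations are assumed to come from a server-side monitor.

```python
from typing import Dict, List


def evaluate_site(
    offline_minutes: Dict[str, float],
    gateway_offline: bool,
    threshold_minutes: float = 3.0,
    site_name: str = "site",
) -> List[str]:
    """Decide between per-server alerts and a single site-down alert.

    offline_minutes maps each server in the group to how long it has
    been offline (0 means online). threshold_minutes is the
    admin-configurable time offline before the process begins.
    """
    offline = sorted(
        name for name, minutes in offline_minutes.items()
        if minutes >= threshold_minutes
    )
    if not offline:
        return []  # nothing has been offline long enough to alert on

    all_offline = len(offline) == len(offline_minutes)
    if all_offline and gateway_offline:
        # Every server and the gateway are unreachable: one alert for the site.
        return [f"site down: {site_name}"]

    # Some servers are still up, or the gateway still answers:
    # treat these as individual server outages.
    return [f"server offline: {name}" for name in offline]
```

Including the gateway as a separate input keeps a group of crashed servers behind a working gateway from being reported as an internet outage.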

eztech · November 18, 2017, 1:47am

@Ilker I apologize, you are correct about how this works. The reason I didn’t believe it is that I’m having issues with alerts, as described in this thread:
https://c1forum.comodo.com/forum/products/other-comodo-products/comodo-device-management/18406-keep-getting-alerts-with-old-dates-and-profiles

That is why I wasn’t getting them. Now if only that issue were fixed…

RT-AMS-ITarian · November 19, 2017, 3:34pm

Good ideas and features here, thanks all for sharing.

We monitor clients’ routers and have a script monitoring shutdown/reboot commands on servers, but being able to detect unscheduled shutdowns or outages would be good.

RT-AMS-ITarian · January 5, 2018, 10:55pm

Has this got anywhere???

It would be good if it could let us know when it does not get a heartbeat from a server or from machines with the option enabled.

We have the online/offline script, but it only alerts when you run a shutdown or power-on command, not on a crash, network drop, etc.