Thursday, 1 October 2015

Deploying Adobe Reader through SCCM

The Adobe cleaner tool is a handy way to remove old Adobe Reader installs before deploying a new version: add it to a package, link its program to the program of the new Adobe Reader version, and enable "Run this program first". It searches for previous versions of the software and removes all detected instances, after which the new version is installed fresh. The whole process is silent and transparent to the user.

To target only instances of Adobe Reader:

AdobeCleaner.exe /Silent /Product=1 /LogLevel=3


Adobe Reader and Acrobat Cleaner Tool

Friday, 6 March 2015

Install SCOM 2012 R2 on Windows 2012 R2 and MS SQL 2012 SP2

Use the SCOM sizing helper tool to calculate hardware resource requirements:

System Center 2012 Operations Manager Sizing Helper Tool

SCOM virtualization:

• For performance reasons, Microsoft recommends storing both the operational and warehouse databases on a directly attached physical hard drive and not a virtual disk.

• Hypervisor snapshots and writing changes to a temporary virtual hard drive are not supported.

• System Center 2012 R2 Operations Manager runs on virtual machines in Microsoft Azure just as it does on physical computer systems.

SCOM OS:

Server Operating System Requirements for System Center 2012 R2

• Installing System Center 2012 R2 Operations Manager on Windows Server 2012 Core requires:

• • Windows 32-bit on Windows 64-bit (WoW64) support, .NET 4.5, Windows PowerShell 3.0.

• • In addition, you need AuthManager. To install AuthManager for Windows 2012, add Server-Gui-Mgmt-Infra (the Minimal Server Interface). To install for Windows 2012 R2, install AuthManager with this command: dism /online /enable-feature /featurename:AuthManager.

• Operations Manager does not support installing the 32-bit agent on a 64-bit operating system.

SCOM SQL DB:

SQL Server Requirements for System Center 2012 R2

• Operations Manager does not support hosting its databases or SQL Server Reporting Services on a 32-bit edition of SQL Server.

• Using a different version of SQL Server for different Operations Manager features is not supported. The same version should be used for all features.

• The SQL Server Agent service must be started, and the startup type must be set to automatic.

• The db_owner role for the operational database must be a domain account. If you set the SQL Server Authentication to Mixed mode, and then try to add a local SQL Server login on the operational database, the Data Access service will not be able to start.

• If you plan to use the Network Monitoring features of System Center 2012 R2 Operations Manager, you should move the tempdb database to a separate disk that has multiple spindles.

• If SQL is shared (e.g. with SCCM), SCOM will need its own SQL instance.

• If you are using a remote SQL server, do not install the SCOM 2012 Reporting Server component on the SCOM server; instead, run the SCOM setup on the SQL server and install the component there.

• Although SQL Server Reporting Services is installed on the stand-alone server, Operations Manager reports are not accessed on this server; instead, they are accessed in the Reporting workspace in the Operations console. If you want to access published reports via the web console, you must install the Operations Manager web console on the same computer as Operations Manager Reporting server.

• No other applications that are using SQL Server Reporting Services can be installed on this instance of SQL Server.

And also:

• Clustering of management servers is not supported in System Center 2012 R2 Operations Manager.

• To view Application Performance Monitoring event details, you must install the Operations Manager web console.

Important recommended limits:

Agent-monitored computers reporting to a management server: 3000
Agentless-managed computers per management server: 10
URLs monitored per dedicated management server: 3000
UNIX or Linux computers per dedicated management server: 500
Network devices managed by a resource pool with three or more management servers: 1000
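
As a rough planning aid, the limits above can be checked against a planned deployment. This is only a sketch: the dictionary keys and the function name are made up for illustration, and the numbers are simply the recommended limits quoted above.

```python
# Recommended limits quoted above (per management server or resource pool).
# The key names are hypothetical labels chosen for this sketch.
RECOMMENDED_LIMITS = {
    "agent_monitored_computers": 3000,
    "agentless_managed_computers": 10,
    "urls_monitored": 3000,
    "unix_linux_computers": 500,
    "network_devices_per_pool": 1000,
}


def over_limit(planned):
    """Return {name: (planned, limit)} for every planned count above its limit."""
    unknown = set(planned) - set(RECOMMENDED_LIMITS)
    if unknown:
        raise KeyError(f"unknown limit name(s): {sorted(unknown)}")
    return {
        name: (count, RECOMMENDED_LIMITS[name])
        for name, count in planned.items()
        if count > RECOMMENDED_LIMITS[name]
    }


if __name__ == "__main__":
    # Hypothetical planned deployment for a single management server.
    plan = {"agent_monitored_computers": 3500, "unix_linux_computers": 200}
    for name, (count, limit) in over_limit(plan).items():
        print(f"{name}: planned {count} exceeds recommended limit {limit}")
```

A count equal to the limit is treated as acceptable; anything above it is reported.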

SCOM prerequisites:

Microsoft System CLR Types for SQL Server 2012
The Report Viewer controls (Operations console)

.NET Framework 4
.NET Framework 4 > HTTP Activation
Windows PowerShell v2 or v3
Windows Remote Management (Management Server)
Windows Installer 3.1
Remote Registry service (Operations Manager reporting)

IIS 7.5 or later, with the IIS Management Console (Web Console):
• Default Document
• Directory Browsing
• HTTP Errors
• Static Content
• HTTP Logging
• Request Monitor
• Static Content Compression
• Request Filtering
• Windows Authentication
• ASP.NET 3.5
• ASP.NET 4.5
• IIS 6 Metabase Compatibility

Installation of the web console requires that ISAPI and CGI Restrictions in IIS are enabled for ASP.NET 4. To enable this, select the web server in IIS Manager, and then double-click ISAPI and CGI Restrictions. Select ASP.NET v4.0.30319, and then click Allow.

You must install IIS before installing .NET Framework 4. If you installed IIS after installing .NET Framework 4, you must register ASP.NET 4.0 with IIS. Open a Command prompt window by using the Run As Administrator option, and then run the following command:

%WINDIR%\Microsoft.NET\Framework64\v4.0.30319\aspnet_regiis.exe -r

SCOM SQL requirements:

Microsoft SQL Server Database Engine
Microsoft SQL Server Full Text Search
Microsoft SQL Server Reporting Services
Microsoft SQL Server Management Console

SCOM service accounts:

SCOM Action account
SCOM SDK account
SCOM Reporting account
SCOM SQL account

Problems:

An SQL instance for SSRS is missing: make sure the SCOM Reporting Server component is being installed on the SQL server and not on the SCOM server itself (unless SCOM runs SQL locally), and that SSRS is not used by any other application.



How to Install the Operations Manager Reporting Server
Preparing your environment for System Center 2012 R2 Operations Manager

Saturday, 28 February 2015

Logon script for importing SCCM PowerShell module


# Path of the PowerShell profile for the currently logged-on user:
$path = "$env:USERPROFILE\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1"

# If the profile does not exist, create it:
If (-not (Test-Path -Path $path))
  {
    # Create the WindowsPowerShell directory (-Force avoids an error if it already exists).
    New-Item (Split-Path -Path $path -Parent) -Type Directory -Force | Out-Null

    # Create the PowerShell profile for the currently logged-on user.
    New-Item $path -Type File | Out-Null

    # Import the SCCM PowerShell module for the currently logged-on user.
    Add-Content -Value "Import-Module '$($Env:SMS_ADMIN_UI_PATH | Split-Path -Parent)\ConfigurationManager.psd1'" -Path $path

    # Set the SCCM site location (change MB1 to your own site code).
    Add-Content -Value "Set-Location MB1:" -Path $path
  }
# Otherwise, do nothing.
Else
 { exit }

Wednesday, 21 January 2015

Test MS SQL connectivity

A simple and quick way to check SQL connectivity: create a text file on the client computer, change its extension to udl, save and open it. The "Select or enter a server name" dropdown menu lists all visible instances of available servers.


Hidden SQL instances:

All SQL instances are listed under:

HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\

MSSQLxx.instancename

To identify the hidden SQL instances, check the following registry key:

HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.instancename\MSSQLServer\SuperSocketNetLib\HideInstance

1 = Hidden
0 = Not hidden/browsable
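
For scripting the check across several instances, the key path and the meaning of the value can be captured in two small helpers. This is a minimal sketch: the function names and the example instance ID are invented here, and reading the value from the registry is left to whatever tool you already use.

```python
BASE_KEY = "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server"


def hide_instance_key(instance_id):
    """Build the HideInstance path for an instance ID such as MSSQL11.instancename."""
    return BASE_KEY + "\\" + instance_id + "\\MSSQLServer\\SuperSocketNetLib\\HideInstance"


def is_hidden(hide_instance_value):
    """Interpret the HideInstance value: 1 = hidden, 0 = not hidden/browsable."""
    if hide_instance_value not in (0, 1):
        raise ValueError(f"unexpected HideInstance value: {hide_instance_value!r}")
    return hide_instance_value == 1


if __name__ == "__main__":
    # "MSSQL11.SQL01" is a hypothetical instance ID.
    print(hide_instance_key("MSSQL11.SQL01"))
    print("hidden" if is_hidden(1) else "browsable")
```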

Discovery Wizard for SQL Server
Hide an Instance of SQL Server Database Engine
SQLPing
Test remote SQL connectivity EASILY!

Thursday, 15 January 2015

Install SCCM 2012 R2 on Windows Server 2012 R2 and MS SQL 2012 SP1

Install features:

.Net Framework 3.5 > HTTP Activation
.Net Framework 3.5 > Non-HTTP Activation
.Net Framework 4.5 > HTTP Activation
.Net Framework 4.5 > TCP Activation
Background Intelligent Transfer Service
Remote Differential Compression

Install IIS components:

Windows Authentication
Application Development > ASP.NET 3.5
Management Tools > IIS 6 Management Compatibility > IIS 6 WMI Compatibility

Create users:

SQL Agent
SQL DB Engine
SQL Reporting

CM Network Access
CM Admin

Install SQL server:

DB Engine Services
Reporting Services
Management Tools

Extend AD schema:

Create a backup of the DC holding Schema FSMO role.
Run CMD as admin > cd cd-drive:\SMSSETUP\BIN\X64\ > extadsch.exe
Check the log for any issues: C:\ExtADSch.log
 

The Active Directory schema extensions for ConfigMgr 2012 are unchanged from those used by Configuration Manager 2007. If you extended the schema for Configuration Manager 2007, you do not have to extend the schema again for ConfigMgr 2012. ConfigMgr 2012 uses the Windows Active Directory (AD) environment to support many of the features it provides and can publish information to AD about sites and services. In this manner, the AD clients of ConfigMgr 2012 have this information easily accessible, but in order to use this feature the AD schema has to be extended in order to create the objects and the classes specific to ConfigMgr 2012. Extending the schema is not required for the installation of ConfigMgr 2012 but it is recommended.


Extending the Active Directory Schema for ConfigMgr 2012 allows clients to retrieve many types of information related to Configuration Manager from a trusted source. In some cases, there are workarounds for retrieving the necessary information if the Active Directory schema is not extended, but they are all less secure than querying Active Directory Domain Services directly. Additionally, not extending the schema might incur significant workload on other administrators who might need to create and maintain the workaround solutions such as logon scripts and Group Policy objects (GPO) for computers and users in your organization. The Active Directory schema can be extended before or after running ConfigMgr 2012 Setup, however as a best practice, it’s best to extend the schema before you run Configuration Manager 2012 Setup. You have to extend the Active Directory schema only once for the forest that contains site servers; you do not have to extend the schema again if you upgrade the operating systems on the domain controllers or after you raise the domain or forest functional levels. Similarly, if you extended the schema for ConfigMgr 2012 with no service pack, you do not have to extend the schema again for ConfigMgr 2012 SP1.

Extending the Active Directory schema is a forest-wide action and can only be done one time per forest. Extending the schema is an irreversible action and must be done by a user who is a member of the Schema Admins Group or who has been delegated sufficient permissions to modify the schema. If you decide to extend the Active Directory schema, you can extend it before or after setup. ConfigMgr 2012 can publish information to AD only after the AD schema has been extended and the steps needed to publish the ConfigMgr 2012 site information to AD have been completed.

You can extend the AD schema using either the extadsch.exe tool or the ConfigMgr_ad_schema.ldf file. When using the .ldf file, you will need to edit and configure it first.

Set permissions on System Management object in AD:

ADSIEdit > Default naming context > CN=System > create a new container CN=System Management > edit its properties

Add the SCCM 2012 computer object to the ACL > grant it Full Control > under advanced options set “This object and all descendant objects” in “Applies to:”.

After the schema has been extended with the classes and attributes that are required for Configuration Manager, create a System Management container in the System container in each site server's domain partition in Active Directory Domain Services. Because domain controllers do not replicate their System Management container to other domains in the forest, you must create a System Management container for each domain that hosts a Configuration Manager site.

Configuration Manager does not automatically create the System Management container in Active Directory Domain Services when the schema is extended. The container must be created one time for each domain that includes a Configuration Manager primary site server or secondary site server that publishes site information to Active Directory Domain Services.

After you have created the System Management container in Active Directory Domain Services, you must grant the site server's computer account the permissions that are required to publish site information to the container. The site server computer account must be granted Full Control permissions to the System Management container and all of its child objects. If you have secondary sites, the secondary site server computer account must also be granted Full Control permissions to the System Management container and all its child objects.

Install Windows Assessment and Deployment Kit 8.1:

Under select features, either leave everything selected (~6.5GB) or untick SQL Express and optionally ACT, VAMT, WPT and WAS.

Run SCCM 2012 prerequisite checker (optional):

Run CMD as admin > cd cd-drive:\SMSSETUP\BIN\X64\ > prereqchk.exe /pri /sql sql_fqdn /sdk sms_provider_fqdn > c:\sccm-prereqchk.log

/pri – primary site
/cas – central administration site (CAS)
/sec – secondary site
/sdk – SMS Provider server
/adminUI – Configuration Manager console
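
If the check is run from a wrapper script, the command line shown above can be assembled programmatically. The sketch below only covers the /pri case from the example; the function name and the host names are hypothetical.

```python
def prereqchk_args(sql_fqdn, sdk_fqdn):
    """Return the argument list for a primary-site prerequisite check.

    The list mirrors the command shown above and could be passed to
    subprocess.run() from the SMSSETUP\\BIN\\X64 folder.
    """
    for name, value in (("sql_fqdn", sql_fqdn), ("sdk_fqdn", sdk_fqdn)):
        if not value or " " in value:
            raise ValueError(f"{name} must be a non-empty host name without spaces")
    return ["prereqchk.exe", "/pri", "/sql", sql_fqdn, "/sdk", sdk_fqdn]


if __name__ == "__main__":
    # Hypothetical server names.
    print(" ".join(prereqchk_args("sql01.domain.com", "sccm01.domain.com")))
```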

Install SCCM:

Download SCCM 2012 prerequisites
Select what to install/upgrade
Set the site code and site name
Set the SQL server/instance FQDN and DB name
Set the SMS provider FQDN
Select the security option for client/server communication (HTTP/HTTPS)
Set FQDNs for management and distribution points.

Install Microsoft Deployment Toolkit 2013:

Run the MDT integration wizard - Configure ConfigMgr Integration.

Problems:

If you get an error "Setup could not install SQL RMO, ConfigMgr installation cannot be completed", you may need to reboot the server to complete the installations of SCCM prerequisites.


If you try to reinstall SCCM using the same SQL server, Prerequisite Check may fail on "Dedicated SQL Server instance", so you'll need to detach and remove the old CM database and delete the registry key HKLM\SOFTWARE\Microsoft\SMS\ on the SQL server.



Tuesday, 17 December 2013

Configuring SCOM 2007 agent in a workgroup (on a Lync edge server)


Here are some interesting bits on fixing the issues related to setting up communication between an agent on a Lync edge server in a workgroup and a domain-based SCOM server.

Computer name vs full computer name

Configuration of a Lync edge server includes setting up the primary DNS suffix, as explained here:

After configuring the DNS suffix, add routes to the Edge server. On the Computer Name tab, click Change; under Full computer name, click More and set "Primary DNS suffix of this computer" to the suffix of the Active Directory Domain Services domain.


This adds the suffix to the computer name and forms the full computer name, which looks like an FQDN. This FQDN-like name has to be used as the common name (CN) of the server when creating the certificate that will be used for communication with SCOM.
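
The name the certificate has to carry can be worked out before requesting it. The sketch below is illustrative only: the function names are invented, and the case-insensitive comparison is an assumption rather than documented agent behaviour.

```python
def full_computer_name(computer_name, primary_dns_suffix=""):
    """Combine the computer name and the primary DNS suffix into the full name."""
    suffix = primary_dns_suffix.strip(".")
    return f"{computer_name}.{suffix}" if suffix else computer_name


def subject_matches(certificate_cn, computer_name, primary_dns_suffix=""):
    """Check whether the certificate CN equals the full computer name.

    Comparison is case-insensitive (an assumption made for this sketch).
    """
    expected = full_computer_name(computer_name, primary_dns_suffix)
    return certificate_cn.strip().lower() == expected.lower()


if __name__ == "__main__":
    # A CN without the suffix does not match the full computer name.
    print(subject_matches("Lync-edge", "Lync-edge", "domain.com"))
    # A CN that includes the primary DNS suffix matches.
    print(subject_matches("Lync-edge.domain.com", "Lync-edge", "domain.com"))
```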

No primary DNS suffix in the CN of the certificate:

Log Name: Operations Manager
Source: OpsMgr Connector
Event ID: 21007
Task Category: None
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
The OpsMgr Connector cannot create a mutually authenticated connection to SCOM-server.domain.com because it is not in a trusted domain.

Log Name: Operations Manager
Source: OpsMgr Connector
Event ID: 21016
Task Category: None
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
OpsMgr was unable to set up a communications channel to SCOM-server.domain.com and there are no failover hosts. Communication will resume when SCOM-server.domain.com is available and communication from this computer is allowed.

Log Name: Operations Manager
Source: OpsMgr Connector
Event ID: 21021
Task Category: None
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
No certificate could be loaded or created. This Health Service will not be able to communicate with other health services. Look for previous events in the event log for more detail.

Log Name: Operations Manager
Source: OpsMgr Connector
Event ID: 20052
Task Category: None
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
The specified certificate could not be loaded because the Subject name on the certificate does not match the local computer name
Certificate Subject Name : Lync-edge
Computer Name : Lync-edge.domain.com


On the CA server, create and import a new certificate. Set the 'Subject name' to CN=computername.domain.com, and the 'Friendly name' to computername.domain.com.

A certificate is required for both the SCOM server and the non-domain member, and it needs to be imported with MOMCertImport.exe /SubjectName <FQDN> on both sides.

Step by Step for using Certificates to communicate between agents and the OpsMgr 2007 server

If the certs are okay, but the new agent has not been approved in SCOM:

Log Name: Operations Manager
Source: OpsMgr Connector
Date: 16/12/2013 11:37:52 AM
Event ID: 20070
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: Lync-edge.domain.com
Description:
The OpsMgr Connector connected to SCOM-server.domain.com, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.

Log Name: Operations Manager
Source: OpsMgr Connector
Event ID: 21016
Task Category: None
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
OpsMgr was unable to set up a communications channel to SCOM-server.domain.com and there are no failover hosts. Communication will resume when SCOM-server.domain.com is available and communication from this computer is allowed.

Approve the new agent in SCOM.

Step by Step for using Certificates to communicate between agents and the OpsMgr 2007 server

If the new agent's host is missing the root certificate:

Log Name: Operations Manager
Source: OpsMgr Connector
Event ID: 20067
Task Category: None
Level: Warning
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
A device at IP x.x.x.x:5723 attempted to connect but the certificate presented by the device was invalid. The connection from the device has been rejected. The failure code on the certificate was 0x800B010A (A certificate chain could not be built to a trusted root authority.).

Import the root cert at the workgroup computer.

Step by Step for using Certificates to communicate between agents and the OpsMgr 2007 server

If the AD integration option is on (on the agent's host):

Log Name: Operations Manager
Source: HealthService
Event ID: 2010
Task Category: Health Service
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
The Health Service cannot connect to Active Directory to retrieve management group policy. The error is Unspecified error (0x80004005).


Turn off AD integration:

In the agent's registry, go to HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\ConnectorManager.
Set EnableADIntegration to 0.
Restart the HealthService.

SCOM Workgroup Monitoring – Disable AD Integration

If the Network Service account (used in the Run As profile for “Microsoft Lync Server 2013 Remote Watcher Profile for Discovery”) cannot access the Lync server or its components (e.g. a database):

Alert: An internal exception has occurred during discovery.
Source: Discovery Script on Lync-edge.domain.com
Alert description: Discovery did not succeed. Monitoring may fail if discovery data's initial state was not available. Please check alert context for details.

Log Name: Operations Manager
Source: Health Service Script
Event ID: 223
Task Category: None
Level: Error
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
DiscoverMachine.ps1 :

--------------------------------------------------------------------------------
-Script Name: Lync Server MP Machine Topology Discovery
-Run as account: nt authority\network service
-Execution Policy: Bypass
--------------------------------------------------------------------------------
Value of Source Id is {2469342F-3092-2CD4-2CE3-D45CA920984C}.
Value of ManagedEntity Id is {DBB1C579-0999-9D12-7B08-AC6C479AE328}.
Value of Target Computer is Lync-edge.domain.com.
Lync Server Module is added
Successfully initialize discovery data.
An exception occurred during discovery script, Exception : Could not connect to SQL server : [Exception=System.Data.SqlClient.SqlException (0x80131904): Cannot open database "xds" requested by the login. The login failed.
Login failed for user 'NT AUTHORITY\NETWORK SERVICE'.




Either add the Network Service account to the local group RTC Component Local Group on the Lync server or modify the Run As account used by 'Microsoft Lync Server 2013 Remote Watcher Profile for Discovery' in SCOM.

SCOM 2012 Lync Server 2013 Management Pack discovery error

After adding the Network Service account to the local group RTC Component Local Group on the Lync edge server:

Log Name: Operations Manager
Source: HealthService
Event ID: 7028
Task Category: Health Service
Level: Information
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
All RunAs accounts for management group SCOM management group have the correct logon type.

Log Name: Operations Manager
Source: HealthService
Event ID: 7024
Task Category: Health Service
Level: Information
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
The Health Service successfully logged on all accounts for management group SCOM management group.

Log Name: Operations Manager
Source: HealthService
Event ID: 7025
Task Category: Health Service
Level: Information
Keywords: Classic
Computer: Lync-edge.domain.com
Description:
The Health Service has authorized all configured RunAs accounts to execute for management group SCOM management group.


Verifying communication between the SCOM server and a workgroup agent:

Check agent/server connectivity (it should say ESTABLISHED):
On Lync server: netstat -a | findstr SCOM-server
On SCOM server: netstat -a | findstr Lync-edge
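
When the check has to be repeated on several machines, the same test can be applied to captured netstat output. This is a sketch under assumptions: the function name is invented, the sample output is made up, and the column layout is the usual Windows netstat one (protocol, local address, foreign address, state).

```python
def established_with(netstat_output, peer_name):
    """Return True if a TCP connection to peer_name is in the ESTABLISHED state."""
    peer = peer_name.lower()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            foreign_address, state = fields[2], fields[3]
            if peer in foreign_address.lower() and state.upper() == "ESTABLISHED":
                return True
    return False


if __name__ == "__main__":
    # Made-up sample output for illustration.
    sample = (
        "  TCP    10.0.0.5:49712    SCOM-server:5723    ESTABLISHED\n"
        "  TCP    10.0.0.5:49800    other-host:443      TIME_WAIT\n"
    )
    print(established_with(sample, "SCOM-server"))
```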

More info:
Installing Lync 2013 Edge Server
Obtaining Certificates for Non-Domain Joined Agents Made Easy With Certificate Generation Wizard
Setup Lync Server Components wizard returns the warning "Host not found in topology" during the Lync Server Edge server installation
Step by Step for using Certificates to communicate between agents and the OpsMgr 2007 server
When you try to install a System Center Operations Manager agent on a workgroup computer without using a gateway server, Operations Manager cannot see the workgroup computer

Monday, 25 March 2013

SCOM 2007 R2: Notifying about SQL jobs that failed to run


A SQL job may fail to run (start) if there is a problem with the job owner’s account (e.g. it was deleted) or there is a database connectivity problem (e.g. DB was removed) or a login for the SQL agent account was not created or granted necessary rights. By default, SCOM 2007 R2 with SQL Server MP will not alert and notify on SQL jobs that failed to run. Database Backup Failed To Complete Rule alerts on failed backup jobs (e.g. a job started, but then it failed for some reason), but will not alert on jobs that failed to run.

In "How to monitor SQL Agent jobs using the SQL Management Pack and OpsMgr", Kevin Holman explained how to discover all SQL jobs and enable the "An SQL job failed to complete successfully" rule in order to alert on all kinds of issues, but I haven’t tried it and I’m not sure if this rule actually alerts on jobs that failed to run.

Failed-to-run jobs are logged in Windows application event log as Warning events with ID: 208.

SQL 2005 on Windows 2003:
Log Name: Application
Event Type: Warning
Event Source: SQLAgent$SQL2005
Event ID: 208

SQL 2008 R2 on Windows 2008 R2:
Log Name: Application
Source: SQLSERVERAGENT
Event ID: 208
Level: Warning

So I created 2 Simple Event Detection monitors with Timer Reset, one for SQL 2005 and one for 2008, enabled alerting and configured notifications to log these issues as incidents in SCSM.

For Monitoring Target, make sure to select SQL 2005 DB Engine and SQL 2008 DB Engine.


Event Source ==> Matches wildcard: $Target/Property[Type="MicrosoftSQLServerLibrary614000!Microsoft.SQLServer.DBEngine"]/AgentName$
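
The matching that the two monitors perform can be sketched outside SCOM, for example to test exported event data. The dictionary keys and the function name are assumptions made for this sketch; the event ID, level and source patterns come from the two samples above.

```python
def is_failed_to_run_event(event):
    """Return True for an Application-log warning 208 raised by a SQL Agent service.

    event is a dict with the hypothetical keys log, event_id, level and source.
    """
    source = event.get("source", "").upper()
    from_sql_agent = source == "SQLSERVERAGENT" or source.startswith("SQLAGENT$")
    return (
        event.get("log") == "Application"
        and event.get("event_id") == 208
        and event.get("level") == "Warning"
        and from_sql_agent
    )


if __name__ == "__main__":
    sql2005 = {"log": "Application", "event_id": 208, "level": "Warning",
               "source": "SQLAgent$SQL2005"}
    print(is_failed_to_run_event(sql2005))
```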

Thursday, 22 November 2012

SCOM 2007 R2: Monitoring and restarting MS SQL backup jobs


As the monitor looks for failed SQL backup job events in Windows application log, it’s a good idea to selectively target the servers and minimise the impact on the server infrastructure. There are 2 ways to achieve this:

1. Create a Windows Simple Event Detection monitor, disable it, and create an override to enable monitoring for a specific group only.

Why targeting a computer group fails?
SCOM 2007: Target a rule or monitor to a computer group

2. Use Authoring Console to create a new class that can be used for discovering servers.

How to monitor a service with unique names across multiple computers using a wildcard?
SCOM: Monitor a custom Services on Windows server 2003

Even though the second option (create a new class) seems a better and cleaner way to go, it’s much easier to manage the membership of a group than to retarget a class (reconfigure discovery). However, I wasn’t able to get a simple recovery task to work when targeting a group through an override (the 1st option), so I proceeded with the 2nd option and created a new class.

I simulated network connectivity issues to find out what happens with an SQL backup job running on SQL 2005 when the network goes down. The following events were logged in the Windows application log:  

Event Type: Error
Event Source: SQL Server service name 
Event ID: 18210
Date: 8/11/2012
Time: 1:25:06 PM 

Event Type: Error
Event Source: SQL Server service name 
Event ID: 3041
Date: 8/11/2012
Time: 1:25:06 PM 

Event Type: Error
Event Source: SQL Server service name
Event ID: 3633
Date: 8/11/2012
Time: 1:25:27 PM 

Event Type: Error
Event Source: SQLISPackage
Event ID: 12291
Date: 8/11/2012
Time: 1:25:27 PM  

Event Type: Warning
Event Source: SQL Server Agent service name
Event ID: 208
Date: 8/11/2012
Time: 1:25:27 PM
Description: SQL Server Scheduled Job 'Job name' …

The events are always logged in this order, so the last one logged was the warning with ID 208. I configured a monitor to look for the warning event.

If the job completes successfully, the following 2 info events are logged:

Event Type: Information
Event Source: SQL Server service name
Event ID: 18264
Description: Database backed up. Database: ...

Event Type: Information
Event Source: SQLISPackage
Event ID: 12289
Description: Package "job name" finished successfully.

These could be used to close the alert.
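
The open/close behaviour described above can be sketched as a small replay over event IDs in the order they were logged. This only illustrates the logic and is not how SCOM implements monitors; the function name is invented.

```python
FAILURE_EVENT_IDS = {208}            # warning logged last when the job fails
SUCCESS_EVENT_IDS = {18264, 12289}   # info events logged when the job succeeds


def alert_is_open(event_ids):
    """Replay event IDs in logged order and report whether an alert is still open."""
    is_open = False
    for event_id in event_ids:
        if event_id in FAILURE_EVENT_IDS:
            is_open = True
        elif event_id in SUCCESS_EVENT_IDS:
            is_open = False
    return is_open


if __name__ == "__main__":
    failed_run = [18210, 3041, 3633, 12291, 208]
    print(alert_is_open(failed_run))                   # failure only
    print(alert_is_open(failed_run + [18264, 12289]))  # failure, then a good run
```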

Create a new class and discovery, save it as a new MP and import in SCOM. Multiple discoveries can be configured in case a small number of servers needs to be discovered. Otherwise, a more generic filtering approach can be used as explained in

How to monitor a service with unique names across multiple computers using a wildcard

To target several servers by their computer names, simply create a discovery for each server and use FilteredRegistryDiscoveryProvider and the registry key “ComputerName” to discover the specific servers. Set the discovery frequency to not less than 4 hours (24 hours would probably be okay).

Then create a new monitor. I used timer reset (2 mins), but Event ID 12289 “Package "job name" finished successfully” can be used as well.

The recovery task will run even if an alert previously raised hasn’t been resolved. I tested this by setting timer reset to 15 minutes and then manually running a backup job and simulating network outage several times within 5 minutes. The task restarted the job every time and almost instantly (as soon as the NIC was back).

Store the new monitor in the same MP used for storing the new class, in order for the class to appear in the target list.

Monitor target: your new class
Event Log: Application
Event Expression:
   Event ID: Equals => 208
   Event Source: Equals => SQL Server Agent service name
   EventDescription: Matches wildcard => Job or Maintenance Plan name

As SQL MP alerts on failed backup jobs using the rule “Database Backup Failed To Complete” which reports on Event ID 3041, it’s probably not necessary to turn alerting on for the new monitor.

Create a new “run script” recovery task and use the VB script below. Modify the server name and the job name. Use only the first part of the job name (e.g. without ‘.subplan_1’).

I tested OSQL.exe with a “run command” task, but it didn’t work. The tool worked without problems when executed manually or through a script locally on the DB server. It’s probably possible to configure it using a batch file and a scheduled task, but I haven’t tried that.

Different ways to execute a SQL Agent job
On Error Goto 0: Main() 
Sub Main() 
   Set objSQL = CreateObject("SQLDMO.SQLServer") 
   ' Leave as trusted connection 
   objSQL.LoginSecure = True 
   ' Change to match the name of your SQL server 
   objSQL.Connect "your server name" 
   Set objJob = objSQL.JobServer 
   For each job in objJob.Jobs 
      if instr(1,job.Name,"your job name") > 0 then 
         ' msgbox job.Name 
         job.Start  
         ' msgbox "Job Started"
      end if 
   Next 
End Sub

Tuesday, 30 October 2012

Failed to send a notification through SMS (SCOM 2007 R2)

SCOM notification for high priority and critical severity events was set to alert through SMS and email channels. This worked well, but occasionally, SCOM would generate an alert about not being able to send a notification through SMS gateway while email worked all the time.

SCOM sends 2 notifications about the SMS-ing issue:

1. Failed to send notification using server/device

Notification subsystem failed to send notification using device/server 'Standard  9600 bps Modem' over 'SMS' protocol to 'phone_number'.

2. Failed to send notification

Notification subsystem failed to send notification over 'SMS' protocol to 'phone_number'.

I tried adding the prefix “+ country code” as advised by Scott in How to configure SCOM 2007 to send SMS messages, but it did not help.

Then I checked the modem for speed configuration and status, and found no issues with it. Here is a useful document with loads of commands for configuring and checking modems:

GPRS AT Commands

Some examples:

AT+CPAS
Description: Returns the activity status of the mobile equipment. (p. 17)

AT+CSQ
Description: This command determines the received signal strength indication (<rssi>) and the channel bit error rate (<ber>) with or without a SIM card inserted. (p. 31)

AT+CBST?
Description: This command applies to both outgoing and incoming data calls, but in a different way. For an outgoing call, the two parameters (e.g., <speed> and <ce>) apply; whereas, for an incoming call, only the <ce> parameter applies. (p. 93) [check the speed configuration]
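
The AT+CSQ reply is easy to turn into a signal level. The sketch below assumes the standard reply format "+CSQ: <rssi>,<ber>" and the usual 3GPP TS 27.007 mapping (0 to 31 maps to -113 dBm up to -51 dBm in 2 dB steps, 99 means unknown); check the modem's own AT command document in case it differs. The function name is invented.

```python
import re


def parse_csq(response):
    """Parse a '+CSQ: <rssi>,<ber>' reply and return (rssi, ber, dbm).

    dbm is None when the modem reports rssi 99 (not known or not detectable).
    """
    match = re.search(r"\+CSQ:\s*(\d+)\s*,\s*(\d+)", response)
    if not match:
        raise ValueError(f"not a +CSQ reply: {response!r}")
    rssi, ber = int(match.group(1)), int(match.group(2))
    if rssi == 99:
        return rssi, ber, None
    if not 0 <= rssi <= 31:
        raise ValueError(f"rssi out of range: {rssi}")
    return rssi, ber, -113 + 2 * rssi


if __name__ == "__main__":
    # Hypothetical modem reply.
    print(parse_csq("+CSQ: 18,99"))
```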

Since the (network) modem looked okay and the parallel email communication was working without issues, we concluded that there may be intermittent issues with the mobile network. So we decided to filter out the SMS related issues from the ‘Failed to send notification’ alert.

There are 2 alert generating NT event log monitoring rules:

1. Failed to send notification alerting rule

This rule creates alert every time notification subsystem fails to send notification using all configured devices/servers

Event log: Operations Manager
Event Source: Health Service Modules
Parameter 1: $Target/ManagementGroup/Name$
Event ID: 31505

2. Failed to send through device alerting rule

This rule creates alert every time notification subsystem fails to send notification through certain device/server

Event log: Operations Manager
Event Source: Health Service Modules
Parameter 1: $Target/ManagementGroup/Name$
Event ID: 31503

As these cannot be modified, they need to be disabled through an override. Then 2 new rules need to be created and configured for additional filtering.

Failed SMS notifications can be filtered out using Parameter 5, which supplies the word ‘SMS’ in the alert description generated by both rules. You can find the parameter in the rules’ Properties > Configuration > Response (view) > Alert Description.



The parameters can also be found under the events’ tab ‘Details’, as well as using log parser as explained by Marcus in Adjusting “failed to send notification using server/device”.

Once the original rules have been disabled, create 2 new rules and save them in your custom MP. Copy the configuration from the original rules:

Rule category: Alert
Rule target: Notification server

Under Configuration > Data sources > Expression insert a new line, set the Parameter Name to Parameter 5, Operator to Does not equal, and Value to SMS.


Copy the Alert suppression settings from Configuration > Responses > Alerting > Alert suppression. Modify priority and severity if necessary.
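The effect of the extra expression line can be sketched in a few lines of Python. This only illustrates the filtering logic, not how SCOM evaluates rules internally; the function name should_alert and the sample parameter values are hypothetical.

```python
# Event IDs of the two original notification failure rules.
NOTIFICATION_FAILURE_EVENTS = {31503, 31505}


def should_alert(event_id, params):
    """Return True if the event should still raise an alert.

    params is the list of event parameters; Parameter 5 is params[4].
    Assumption: Parameter 5 holds the protocol name, e.g. 'SMS'.
    """
    if event_id not in NOTIFICATION_FAILURE_EVENTS:
        return False
    protocol = params[4] if len(params) >= 5 else ""
    # Mirrors the expression "Parameter 5 Does not equal SMS".
    return protocol != "SMS"


# Illustrative parameter lists; only the fifth entry matters here.
print(should_alert(31503, ["MG1", "x", "x", "x", "SMS"]))
print(should_alert(31503, ["MG1", "x", "x", "x", "SMTP"]))
```

With these inputs the SMS failure is dropped and the other failure still alerts.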

Thursday, 5 July 2012

SCOM 2007: Set up a recovery task for a process running under specific credentials

If you have a process running under a specific service account, and you want to monitor it and automate a recovery action in case it crashes, here is an easy way to do it without exposing the credentials.

Find out how to get the process up and running directly from Windows (e.g. by running an exe file or a script) and create a scheduled task for it, without actually scheduling it. Set the task to run under the proper credentials and select ‘Run whether user is logged on or not’.

In SCOM, configure monitoring of the process and set up a recovery task in the monitor’s properties. Use the tool ‘schtasks.exe’ to run the task created earlier.

Set these under the Command Line tab:

Full path to file:
C:\Windows\System32\schtasks.exe

Parameters:
/run /s servername /tn taskname

Note: This won't work for every application (e.g. IPFX telephony application). You need to test it by manually terminating a process and confirming that the task is able to recover it. Also, make sure to check the application and process behaviour following a server reboot.
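For manual testing outside SCOM, the same recovery command can be assembled in a script. This is a minimal sketch that only builds and prints the command; 'servername' and 'taskname' are the placeholders from above, and build_recovery_command is a hypothetical helper name.

```python
SCHTASKS = r"C:\Windows\System32\schtasks.exe"


def build_recovery_command(server, task_name):
    """Return the schtasks argument list used by the recovery task."""
    return [SCHTASKS, "/run", "/s", server, "/tn", task_name]


command = build_recovery_command("servername", "taskname")
# On a Windows host the list could be passed to subprocess.run();
# here it is only printed for inspection.
print(" ".join(command))
```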

Wednesday, 27 June 2012

SMS-ing meaningful messages from SCOM 2007

I tried to set up SMS notifications in SCOM 2007 R2 with the minimum data necessary for the message to have some value, including:

1. Resolution state
2. Device name or IP address
3. Basic description

This should do the job:

$Data/Context/DataItem/ResolutionStateName$ Issue: $Data/Context/DataItem/ManagedEntityPath$\$Data/Context/DataItem/ManagedEntityDisplayName$ - $Data/Context/DataItem/AlertName$

Unfortunately, the limit for default encoding is 160 and for Unicode only 80 characters, so it didn't work.

Eventually, a colleague of mine came up with an elegant solution:

Create 2 SMS channels and set the following text messages (hard-coded resolution state + source + alert name (basic description)):

1. New - $Data/Context/DataItem/ManagedEntityPath$\$Data/Context/DataItem/ManagedEntityDisplayName$ $Data/Context/DataItem/AlertName$

2. Closed - $Data/Context/DataItem/ManagedEntityPath$\$Data/Context/DataItem/ManagedEntityDisplayName$ $Data/Context/DataItem/AlertName$

Then create 2 SMS subscriptions:

1. Under subscription criteria, select your standard filtering and in addition to that select "with specific resolution state" and set the state to New and use the 1st channel.

2. Under subscription criteria, select your standard filtering and in addition to that select "with specific resolution state" and set the state to Closed and use the 2nd channel.
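To see whether a message still fits once the variables are expanded, a quick length check helps. The sketch below uses the limits quoted in this post (160 characters for default encoding, 80 for Unicode) and sample values that are purely hypothetical. Note that len() counts Python characters, which may not match how the modem counts them.

```python
# Character limits as quoted in the post above.
LIMITS = {"default": 160, "unicode": 80}


def render(state, path, name, alert):
    """Build the message in the same shape as the channel text."""
    return "%s - %s\\%s %s" % (state, path, name, alert)


def fits_sms(message, encoding="default"):
    """Return True if the message is within the limit for the encoding."""
    return len(message) <= LIMITS[encoding]


# Hypothetical sample values for illustration only.
msg = render("New", "srv01.example.local", "srv01", "Health Service Heartbeat Failure")
print(len(msg), fits_sms(msg), fits_sms(msg, "unicode"))
```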

Thursday, 23 February 2012

Unable to open SCOM Web Console from a PC

RMS – SCOM 2007 R2 RU5 on Windows 2008 R2 SP1
SCOM Reporting - SQL 2008 R2 on Windows 2008 R2 SP1
IE9 on Windows 7

I installed the SCOM web console on the RMS and was able to access it from the SCOM server itself, but when I tried to open it from my PC, I got a logon prompt that accepted neither my standard user credentials nor my domain admin credentials.


After 3 attempts, IE displays a message:

You do not have permission to view this directory or page

Kevin Holman explains in his article Installing the Web Console on a 2008 Management Server - using Windows Authentication that we need to set the following properties in the AD computer account of the server running the SCOM web console:

For Windows 2008: Trust this computer for delegation to any service (Kerberos)


For Windows 2003: Trust this computer for delegation to specified service only, then select Use Kerberos only and add the SDK account.

But this did not fix the issue in my case. I found in this thread on MS discussions that the identity of SCOM's application pool needs to be changed to the SDK account, but when I tried it, I got an error:

Bad Data (Exception from HRESULT: 0x80090005)


I found the explanation of the issue in Caution while xcopying IIS 7.0 config files and tried setting the identity to Local System account and recycling the application pool, and that finally fixed the issue.

Monday, 13 February 2012

Tuning Heartbeat alerts in SCOM 2007 R2

What is a heartbeat and how does it work?

Once a SCOM agent is deployed on a host, e.g. Windows server, it establishes a connection with SCOM MS and sends a Heartbeat packet to it at the specified interval. The purpose of this communication is to let the SCOM MS know that the agent is alive and working, and that the Health service is up and running at the agent’s side. It actually does not report on the health of the server itself or the network link status. In case the Heartbeat packet has not been received for the timeframe defined by “Number of missed Heartbeats allowed” x "Heartbeat interval (seconds)", an alert will be generated to inform that there is a problem with the agent:

Alert: Health Service Heartbeat Failure

Following this, a diagnostic ping will be issued by the SCOM RMS in order to check if the server itself is available and responsive. The ping is a single ICMP packet, without the statistics that a regular ping with ping.exe calculates. If it fails, an alert will be generated:

Alert: Failed to Connect to Computer

This alert actually informs that the server failed to respond to a ping, either due to network or software/hardware issues.


Tuning Heartbeat

By default, a Heartbeat check is set to run at 60-second intervals and the SCOM MS will tolerate 3 missed responses. If the 4th one is missed as well, SCOM will generate an alert.

The number of missed heartbeats can be overridden at the management server level and heartbeat interval can be overridden at the agent level.

Heartbeat monitoring can be disabled for all agents or for the following specified agents:

• That connect to the network intermittently.
• That connect to the network over poor connections or use dial-up connections.
• On systems that are frequently restarted.

If there are intermittent issues with the communication between the SCOM server(s) and agents, and they do not affect the end users, alerting can be suppressed by decreasing the interval value to something like 15 seconds and increasing the missed responses value to something like 16. This way, the total allowed time-out will still be close to the default value (new: 4 min 15 s vs original: 4 min), but it will allow for more frequent communication attempts, which might reset the heartbeat failure counter before an alert is generated.
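The time-out figures can be checked with a small calculation. This sketch assumes, as described above, that the alert fires on the first missed heartbeat beyond the allowed count, so the delay is interval x (allowed + 1). The function names are hypothetical.

```python
def seconds_until_alert(interval_s, missed_allowed):
    """Delay before a heartbeat failure alert is raised.

    Assumption: the alert fires on the first missed heartbeat beyond
    the allowed number of missed heartbeats.
    """
    return interval_s * (missed_allowed + 1)


def as_min_sec(seconds):
    """Format a number of seconds as 'M min S s'."""
    minutes, secs = divmod(seconds, 60)
    return "%d min %d s" % (minutes, secs)


print(as_min_sec(seconds_until_alert(60, 3)))   # default settings
print(as_min_sec(seconds_until_alert(15, 16)))  # tuned settings
```

The same function can be used to compare other interval and threshold combinations before applying an override.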

For more critical servers, the Heartbeat interval can be set to 10 seconds or so, which would result in more aggressive monitoring.

If these alerts generate a lot of noise after business hours, and alert priority and/or severity is used for alert filtering, it is also possible to create an override that changes the following values for the monitor Health Service Heartbeat Failure. The alerts are then no longer marked as critical, and no notifications are sent to the after-hours support staff.

Alert Priority: from High to Medium
Alert Severity: from Critical to Warning

Authoring > Monitors > search for Health Service Heartbeat Failure > expand it and under Health Service Watcher (Agent) > Entity Health > Availability, right click Health Service Heartbeat Failure and select Overrides > Override the Monitor > For all objects of class: Health Service Watcher (Agent).

Note that this can hide problems, as described here, when a server gets stuck loading Windows but still responds to pings. To catch such cases, check agent status regularly in the Monitoring console: agents that have issues communicating with the SCOM server will turn grey.


More info:
Heartbeat and Heartbeat Failure Settings in Operations Manager 2007
Health Service Heartbeat Failure, Diagnostics and Recoveries