Topics Archive - Experda

Introduction to Experda Software

Experda provides you with the most efficient management and control tools, displaying all of the organization’s MS SQL servers and databases in a unified, graphical, and intuitive management interface. Thus, the DBA can devote his time to the complex and important tasks at hand without wasting precious time on routine operations and the intricacies involved in database management and performance improvement on servers.

Furthermore, Experda significantly streamlines the DBA’s work. The system assists in improving database performance, reducing the frequency of glitches and crashes in databases, identifying problematic trends, preventing faults, and even investigating and pinpointing their causes.

As Experda users, you can benefit from the close support of Experda’s NOC team, which will provide you with ongoing remote DBA services 24/7, monitoring and managing any suspicious events, handling glitches in real-time, and preventing faults in their formative stages.

Please note that some nuances and specifics of the language might be better understood in the native context. However, this translation should give you a clear idea of the content.

Key features

A central and intuitive graphical interface that displays all MS SQL servers and databases in one place.
Simultaneous maintenance and control rules/alerts for all servers.
24/7 monitoring and alerts.
Detection and release of deadlocks/blocks/locks.
Multi-Server Scripting: Giving the ability to run queries on multiple servers simultaneously.
The system is tailored for system administrators who may not necessarily have a background or experience as a DBA.
Easy alert configuration using predefined templates.
Performance tracking

Architecture Experda

Terminology

Client/Company

A client is synonymous with a company. A client manages multiple managed servers, usually monitored by a single agent, but it can be multiple agents. The client may be associated with multiple users.

Managed Server

Any client SQL Server instance that is added for monitoring in the UI is called a “managed server.” Managed servers are supported from version SQL 2005 and above.

Agent

The Experda Agent is a Windows service that is locally installed on a dedicated machine. The agent is responsible for the collection of all the data from the managed server machines and executing commands. The agent also behaves as a coordinator between the UI and the managed servers. For example, the user has selected to backup a database on a managed server. The agent will then create a backup task and monitor the progress until completion. If the agent’s service is disabled or not enabled and running, you will likely find that information is lacking and actions are not working in the UI.

If the agent is down, you will receive an alert.

For larger clients that wish to monitor more than 20 managed servers, an additional agent may be installed under the same client but on a different dedicated machine.

Multiple Agent Installations

Clients that have multiple domains spread over multiple physical locations will usually choose to install an agent per physical location. For example, if the client has servers in the USA and Europe, the client will most likely choose to install an agent on a dedicated server in the USA for the US servers and another agent on a dedicated server for the European servers.

OnPrem installation

A local agent + repository installation, which allows the client to operate completely within its internal network. This is a less common installation and is usually for enterprise companies.

Cloud installation (common)

A local agent installation using the main cloud repository system.

Installation and Setup

Prerequisites: Hardware and software requirements

Experda installs a single centralized agent in your business’s network. It is preferable that this agent is installed on a strong, dedicated server. This agent collects and analyzes a tremendous amount of information. Providing it with the appropriate resources will benefit the monitoring speed and give a better overall user experience.

There are a few exceptions to this rule: Some organizations only have one production machine, and they prefer to place the agent on that machine. Another exception is an organization that has an unusually large number of monitored servers, in which case we recommend creating several agents working under the same client, each installed on a dedicated server.

List of requirements for the dedicated server working on the OnCloud version:

Windows server 2019 and above
32G RAM
Minimum 4 Logical Processors
Verify that SQL Server Port id open between the agent to the managed servers
Open WMI default Ports between agent to the managed servers
Verify that RPC and WMI services are running on the managed servers
Open the https://www.experda.com/ domain and sub domain in the general firewall In case the installation is OnPrem, the repository prerequisites are:
Installed on the same machine as the agent
Windows server 2019 and above
32G RAM
SQL Server 2019 STD+
Enabled Full-Text Search at installation
Define at least 12G for the repository instance
Verify that SQL Server Port id open between the repository to the managed servers
Open WMI default Ports between the repository server to the managed servers
Verify that RPC and WMI services are running on the managed servers
Allow connection between user clients to the new site created on the repository server in order to open Experda’s UI directly through the browser

Step-by-step installation process

Step 1 – Sign up to Experda

Goto the Experda website and create a new account
Enter your work email and press continue. A verification email will be sent to you.

Enter the verification code and press “continue”.

After continuing, save your credentials for later usage.

Step 2 – Create a new company from the main dashboard

After successfully registering your details, you will arrive at your main dashboard, which will be empty. Press the button “create company” from the top right corner or alternatively from the top left main viewer options drop-down list.

An add new company dialog will appear. Enter the details of the company/client.

Select “Experda” as the partner.
Select your country from the support country If your country is missing, select another country and contact Experda to ask for your country to be added.
Select your timezone from the list.
Press “Create Company”.
Expect up to 60 seconds for the new company to be created and affiliated with your account.

Step 3 – Download agent installation file

From the dialog box, copy and paste the text for later usage.
Press “Download Agent” to download the agent installation file.

Save the agent installation file, as it is tailored specifically to your account.

Step 4 – Install the agent on a dedicated machine

Consult the agent installation prerequisites and hardware options before installation.
Download the file to the dedicated agent machine and follow the following installation instructions.

Step 1 – press “install”.

Step 2 – Check the “Launch” checkbox and “finish” to load the agent configuration. If you miss the “Launch,” you can always relaunch the agent configuration directly from the path: “C:\Program Files\ezCheetaCloud\CloudSetup\CloudInstaller.exe”

Step 3 – Enter service information credentials. These credentials will be used to run the agent as a local service. Make sure to provide credentials with administrative roles.

After pressing “Apply,” the following “Success” message should appear on the top left corner of the wizard.

Step 4 – Portal Information

The username, password and client id are built into the msi file itself and should appear automatically. Press “Next” to complete the configuration process.

After the installation, a new service will be added to the dedicated machine called “Experda Agent”.

Step 5 – Add a new Managed Server [see next section]

Add a new managed server

Return to the UI; your company will now appear in your company dashboard. Press it. Your company overview dashboard will appear. If this is a new company, the list of servers will be empty.

From the top left corner, press “add server” to add your first managed server. Allow up to 30 seconds for the list of servers in the local network to load.

From the list of “All Servers,” select your managed server. If the server does not exist in the list, it might be because the “SQL Server Browser” server is not enabled on the agent‘s server. In this case, you can press “Add server manually” to add the server details manually.

Select one or more servers from the provided list to add them to the list of pre-approved managed servers.

In the list of pre-approved managed servers, update the machine connection credentials and the database connection credentials by pressing the edit icon. For the database connections, you will need to provide system administrative credentials. For the machine connections, a user that is strong enough to read WMI/CPI and performance information from the machine. A domain administrator or local administrator should suffice.

Once the credentials are entered, you will see both “linked” green icons next to the connection, which means that the credentials are validated and the server may be added.

After all the servers you wish to add are pre-approved, you need to check the check box on the right side of the pre-approved list next to the server name and press the “choose servers and continue” to add them.

The new server will be added to the list of monitored servers.

Allow several minutes for the agent to start collecting information. We also recommend that you attach an alert template to the new server to start receiving server alerts.

You will also notice a small repository database has been added to your managed server called “Experda”.

Initial configuration and settings

After you have successfully installed the Experda agent and have add- ed your managed server(s) – now is the time to configure your alerts, collection intervals and maintenance.

A small repository will be installed.

Main Controls

Top Left Navigation Menu

The top left navigation menu provides oversight for your associated client views. The “View All” will send you to your Dashboard View. The client tree will also show you all your associated partners and clients. You can also add clients/servers directly from the menu.

*Legend*
1	Global search – will search all companies G servers
2	Reference to Dashboard View
3	Partner Companies Tree
4	Company selection
5	Add a new company

Server Tree

The server

Tree provides the full client -> server -> databases hierarchy. The server tree will only change when you change your active client/company.

Choosing a different entity (server/database) from the server tree will change the selected view on the right. For example, if you are in the alert setup view and you stay on a server in the server tree, only server-level alerts will be displayed. If you click on a database from the server tree, the view will change to display database-level alerts that are defined for the selected database.

Not all views support database-level views. For example, many of the performance views only support the server level.

*Legend*
1	Collapse/Expand all logical server groups
2	Reference to the company dashboard
3	Search filters. Available filters: • Severity • Tags • Server Type • Database Status • Database Recovery Models
4	Server Level. Press a server to see the Server Dashboard
5	Database level. Press to see the Database Dashboard

Alert Counter View

The alert counters show the total sum of active alerts of all alert types for the given level.

At the user dashboard level, it is the sum of all active alerts for all user-associated companies.

At the company level, it is the sum of all active alerts for all managed servers.

At the server level, it is the sum of all active alerts for the selected managed server.

The alert counter has different views. Some are inline, while others are panels:

Panel level counter view example:

Row inline counter view example:

Paging Control

The paging control is used for paging between large result sets of information. By default, each page has 10 results. You can see the full row count and page left or page right accordingly.

*Legend*
1	Choose the total rows per page; default is 10
2	Location of page in the total rows/pages. In the example, we are viewing rows 1-10 out of a total of 1784 rows
3	Go to previous page
4	Go to next page
5	The total Pages

Timeframe Control

The timeframe control dictates the overall time frame for all monitoring. It defaults to the last 10 minutes. However, in some modules, it is recommended that an extended timeframe of days, weefis and even a month is used.

The timeframe control is set to be “live” (end time always queried to the current time). You can also set it on history to ׳allow you to choose a start and end time to investigate historical events.

Timeframe view

*Legend*
1	Open timeframe control
2	Toggle between calendar view (default) and timeline view
3	Toggle between live view (default) and historical view
4	Easy access list to pre- determined common live timeframes
5	Press “Custom” in the list to specifically select a start date and time.
Note: In “Live” view we only select a start time frame, while the end will be the present.

Timeframe control Live view – timeline view

In the timeline view, you can choose the start and end times on a time- line control to investigate historical events easily.

Timeframe control History view

*Legend*
1	Easy access common time frames
2	Start date and time
3	End date and time

Dashboard View

The “Companies overview” view is the first to be reached after login. From here, the user may choose a particular company they wish to investigate. Once a company has been selected, the server tree will update to display the company’s server hierarchy.

The dashboard provides a higher-level overview of the alert status for every company, the number of servers, main services, and agent status.

Dashboard View – Default View

Legend

Top left menu

Time + timeframe control

Action button – create a new company

Alert Counter Control – total active alerts for all companies in view

Company view selection: Default, Gallery view, List view General search by company name

Additional filters for company view: status, company name, severity and company partner

A single company summary information panel

*Legend*
1	Company name + partner (if it exists)
2	Total active alerts for all managed servers.
3	Total SQL server services enabled/disabled.
4	Total active managed servers

Dashboard View – Gallery View

A more simplified version allows less information per company but more companies displayed in a single view.

Dashboard View – List View

Company / Client Dashboard

Main Company’s Managed Server View

Once an active company has been selected, this view will provide an oversight of the company’s servers. From here, you can easily recognize things like too many critical alerts, high levels of CPU, disabled or inactive servers and much more.

This is the middle level before recognizing and investigating other potential threats or performance issues.

Legend
1	Alert Counter Indicator
2	Total Server Count
3	Total Database Count (in all Managed Servers)
4	Main Managed Server Table

Main managed Server Table

Column Name	Description	Tooltip
Alert Counter Indicator	Fast visible indicator to see if the managed server has active alerts and their severity level
Host	The name of the host machine	The version of the operating system
Instance	The name of the instance: if the instance is the default instance, it will be called MSSQL	The version of MSSQL
Alerts	The number of active alerts from each severity level (Information, Warning, Error)	The version of MSSQL
CPU	The Host’s latest machine CPU	Tooltip shows a chart of the last 10 minutes
Memory	The Host’s latest machine used RAM	Tooltip shows a chart of the last 10 minutes
ID	The Host’s last sum of IO operations	Tooltip shows a chart of the last 10 minutes
Wait	The last MSSQL wait type + wait time
Up Since	The last instance restart time
SQL Service	The status of the main SQL service. Recommended value – “Active” in an active environment
SQL Agent	The status of the agent service. For MSSQL Express, this value is irrelevant. If the value is not Active, that means the agent is not running, and jobs/mainte- nance operations are not being executed.
Open Server Overview	Pressing this will redirect you to the managed server dashboard, selecting the row’s managed server

Client Dashboard Additional Actions/Filtering Options

Pressing the “Filters” button on the top right corner of the main managed server table will open up the filter menu:

Group filtering – Filtering by the managed server’s server group
Server – Filter by server name. Status – Filter by servers that have active alerts from a selected list of statuses.
Active/Inactive – servers that are active only, inactive only or both.

Pressing the “Search” textbox and enter will filter out the server list by name.

Server Dashboard

The server dashboard holds higher-level information for the entire server. From the dashboard, you may be redirected to other, more specific sections like performance or security.

Server Dashboard – Part 1

Legend

Server storage summary, including the last status of the disks, how full they are and their daily growth.

An overview of the storage of physical disks/partitions on the managed server’s machine.

For each partition, the following information is displayed:

Total size
Total used size
Partition Name
Daily Growth in
The Data/Log/Other distribution within the

Actions: Pressing “More info->” will redirect to “Performance->Disk“

An overview of the Security status of the managed server. On the left – the last health score.On the right – the total of suspicious activities that exist in the system. Actions:Pressing the “more info” arrow-> redirects to Server Security

Server Dashboard – Part 2 – Performance Overview

Section	Image	Description
Performance Overview – left	Action buttons: CPU arrow -> redirect to Performance-> CPU Memory arrow -> redirect to Performance->Memory Disk Latency arrow -> redirect to Performance-> Disks	The main machine CPU chart according to the selected time The main machine used Memory/RAM for the selected time Disk latency for all partitions for the selected time range.
Performance Overview – right	Action buttons: PLE arrow -> redirect to Performance->Memory Network arrow -> redirect to Performance->Network Waits arrow -> redirect to Performance->Waits	Description

Database Summary Dashboard

The database dashboard gives an oversight of the selected server’s databases. From here, we can identify potential disk issues, missing backups and overactive databases.

Legend

Main Database Grid

Name	Description
Database Name	The name of the database
Status	The last status of the database. Database status may be: ONLINE – Database is available for access. The primary filegroup is online, although the undo phase of recovery may not have been completed. OFFLINE – Database is unavailable. A database becomes offline by explicit user action and remains offline until additional user action is taken. For example, the database may be taken offline in order to move a file to a new disk. The database is then brought back online after the move has been completed. RESTORING – One or more files of the primary filegroup are being restored, or one or more secondary files are being restored offline. The database is unavailable. RECOVERING – Database is being recovered. The recovering process is a transient state; the database will automatically become online if the recovery succeeds. If the recovery fails, the database will become suspect. The database is unavailable. RECOVERY PENDING – SQL Server has encountered a resource-related error during recovery. The database is not damaged, but files may be missing, or system resource limitations may be preventing it from starting. The database is unavailable. Additional action by the user is required to resolve the error and let the recovery process be completed. SUSPECT – the primary filegroup is suspect and may be damaged. The database cannot be recovered during the startup of the SQL Server. The database is unavailable. Additional action by the user is required to resolve the problem. EMERGENCY – The user has changed the database and set the status to EMERGENCY. The database is in single-user mode and may be repaired or restored. The database is marked READ_ ONLY, logging is disabled, and access is limited to members of the sysadmin fixed server role. EMERGENCY is primarily used for troubleshooting purposes. For example, a database marked as suspect can be set to the EMERGENCY state. This would permit the system administrator read-only access to the database. Only members of the sysadmin fixed server role can set a database to the EMERGENCY state.
Recovery	The databases’ recovery modules: SIMPLE – No log backups Automatically reclaims log space to keep space requirements small, essentially eliminating the need to manage the transaction log space. FULL – Requires log backups. No work is lost due to a lost or damaged data file. Can recover to an arbitrary point in time. BULK LOGGED – Requires log backups. An adjunct of the full recovery model that permits high-performance bulk copy operations. Reduces log space usage by using minimal logging for most bulk operations.
Transactions per second	A chart of the last 10 minutes indicating total transactions per second (AVG). This will indicate how “busy” the database is.
Data	Total Data/Log distribution for the database
Total size	Database total size in MB
Compatibility	The database compatibility level
Last log backup	Last backup date for log type backup
Last data backup	Last backup date for data type backup

Filter options for database view
You may filter by the available recovery modules, database status and available compatibility levels.

Shared Performance Modules

Queries Grid

The queries grid displays queries according to the timeframe control
The queries displayed are grouped by
The origin of the queries is the MSSQL’s This is the same reason why the queries are without actual parameters and displayed as “prepared statements.” The statistic is only recorded after the end of the execution.
The queries grid also shows statistics of stored procedure executions. The row for stored procedure execution is colored yellow, and the text begins with “SP:”.

The queries grid exists in the following modules:
CPU Module – ordered by CPU descending
Memory module, Disk Module – ordered by Logical Reads descending
Queries Module – ordered by Duration descending

*Legend*
1	Standard table paging control
2	“View Details” button will open the full query text

Queries Grid Main Table

Column	Description
Time	The time of the snapshot
Last execution	Last execution time of the query
Execution count	Total executions for the given query type for the timeframe
Duration	The sum in ms of the total durations for the given timeframe
CPU usage	The sum in ms of the total CPU for the given timeframe
Details	The first 200 characters of the text. This column is similar to the “TextData” column in the MSSQL profiler. The tooltip will present extra characters. To see the full query, press the “view details” button on the right side.
Physical Reads	The total physical reads for the given query. Physical reads are when there is a transfer of information from the SSD/H-disk to the memory and not when information is read from within the memory.
Logical Reads	The sum total of logical reads. This means a page that was found in the MSSQL cache and not brought from the SSD/H-disk.
Logical Writes	The sum total of logical writes.
Total Rows	The sum total of rows that were affected by the operation. In a select statement, the sum total of rows that were brought back to the client.
CLR Time	The sum total of time the query spent in CLR (associated with CPU performance).
Database	The original database this query was derived from. The database context depends on the active database the connection was using at that time. For example, in the management studio, the active database can be changed. The active database can also be changed in the connection string. Note: Notice that some applications do not use the active database and use full qualifiers from ‘master’.
View Details	A button that opens a window with the full text of the query.

Enhanced Charts

*Legend*
1	Chart Title
2	Save the filter/grouping button to change the display. See Filter/Grouping Options below
3	Compare button – not active in all Enhanced Charts
4	Alert indicators Active in CPU and Memory This means that an alert was raised at the given time. Only Warning and Error alerts are Will usually correlate to If not correlated, it means the alert configuration values were recently altered. Note: Alert indicators only exist if main timeframe range is under 4 hours
5	Alert lines Alert lines as defined in the alert configurations for the given machine/server. Only exists if the timeframe is under 4 hours (raw data analysis).

Physical Reads	The total physical reads for the given query. Physical reads are when there is a transfer of information from the SSD/H-disk to the memory and not when information is read from within the memory.
Logical Reads	The sum total of logical reads. This means a page that was found in the MSSQL cache and not brought from the SSD/H-disk.
Logical Writes	The sum total of logical writes.
Total Rows	The sum total of rows that were affected by the operation. In a select statement, the sum total of rows that were brought back to the client.
CLR Time	The sum total of time the query spent in CLR (associated with CPU performance).
Database	The original database this query was derived from. The database context depends on the active database the connection was using at that time. For example, in the management studio, the active database can be changed. The active database can also be changed in the connection string. Note: Notice that some applications do not use the active database and use full qualifiers from ‘master’.
View Details	A button that opens a window with the full text of the query.

Enhanced Charts

*Legend*
1	Chart Title
2	Save the filter/grouping button to change the display. See Filter/Grouping Options below
3	Compare button – not active in all Enhanced Charts
4	Alert indicators Active in CPU and Memory This means that an alert was raised at the given time. Only Warning and Error alerts are Will usually correlate to If not correlated, it means the alert configuration values were recently altered. *Note: Alert indicators only exist if main timeframe range is under 4 hours*
5	Alert lines Alert lines as defined in the alert configurations for the given machine/server. Only exists if the timeframe is under 4 hours (raw data analysis).

Enhanced Chart Filter/Grouping

*Legend*
1	Compare button – opens the option to compare the current timeframe with a previous day/week in the same performance metric context.
2	Moves timeline forward/backward.
2	Opens/Closes the filter/ grouping dialog.
2	Opens the Enhanced Chart Date Grouping Options (see below)
2	Opens the Enhanced Chart Grouping Function Options (see below)
2	Enables/Disables the alert lines.
2	Save/Cancel user selection.

Enhanced Chart Grouping Date Grouping Options. The grouping options will be enabled in correlation with the time range of the main timeframe.

None – there is no grouping.
Minute – the grouping is by the minute.
Hour – the grouping will be by the hour.
Day – the grouping will be by the day.
Week – the grouping will be by the week.

Note: If the time frame is more than 4 hours, the information brought from the repository will be brought from aggregated tables.

Enhanced Chart Grouping Function Options By default, a grouping function is selected per

performance metric. For example, CPU will be AVG, and Duration will be SUM.

However, you may select a grouping function by yourself here.

Note: In some charts, like CPU, “Sum” aggregation won’t appear.

Compare Graphs Feature

The compare graphs feature allows the user to compare selected graphs sharing the same timeline manually. For example, if the user wishes to see the PLE chart with memory or the CPU chart with database transactions per second.

These tools will improve your ability to identify and solve performance bottlenecks.

*Legend*
1	Compare Graphs button – press to get the pop up “compare graphs” dialog.
2The first/primary graph to compare
2	Press the “Add graph” button to add additional secondary graphs to compare.

Performance CPU View

The CPU performance view is one of the most important views in the system. It allows the breakdown of CPU-related events, multiple instances of the same machine events, processes affecting the CPU and more.

Use this view to monitor and identify performance issues that are directly related to CPU-intensive systems.

CPU – Part 1

*Legend*
1	CPU Usage – for the selected instance. This is the last sqlservr.exe process CPU for the given instance.
2	CPU Usage – for Machine. The total last CPU usage for the given machine. The default sample interval is every 60 seconds and can be changed in Settings.
3	CPU Usage Enhanced Chart. Shows all MSSQL instances for the same machine. Alert lines and alert indicators are associated with the “Machine SQL Process CPU Over Threshold” alert configuration type.
4	Server total CPU Usage Enhanced Chart. Follows the server’s total CPU over the selected timeframe.
5	Top processes. The AVG CPU time for the selected timeframe of the top OS processes.
6	Top cores. The AVG CPU usage for the selected timeframe for the machine’s cores. Core names as given by the OS.

CPU – Part 2

*Legend*
1	CPU Waits – Chart of MSSQL wait events that are associated with CPU stress: CXPACKET – Caused when parallelized query threads have an unequal workload, causing a block on the entire query until the longest-running thread is complete. See help for further discussion. SOS_SCHEDULER_YIELD – Caused when a task voluntarily yields the scheduler for other tasks to execute. During this wait, the task is waiting for its quantum to be renewed. CXCONSUMER – CXCONSUMER itself does not directly consume CPU resources. Instead, it represents a wait state where the Consumer thread is idle, waiting for data. However, high occurrences of CXCONSUMER can indicate that the system is heavily utilizing parallelism, which can lead to high CPU usage overall due to the large number of parallel threads being executed
2	Database CPU usage. This list is created directly from the statistics of the queries that are correlated with default databases. The CPU usage data is calculated as a ratio of the total. The sum of all values will always be 100%. This indicator will help recognize the most active databases.
3	Queries Grid – sorted by default by CPU

Performance – Memory View

The memory performance view allows the breakdown of memory (RAM) related events, multiple instances of the same machine events, processes affecting the memory and more.

Use this view to monitor and identify performance issues that are directly related to memory-intensive systems.

Memory – Part 1

*Legend*
1	Allocated memory usage of the instance. This will show the latest instance memory allocation from the machine total.
2	Used memory/RAM for the entire server. This will show the latest collected value.
3	Enhanced Chart of all instances on the selected instance’s machine over time with alert values for the selected time range.
4	Enhanced Chart of the server used memory/RAM allocation for the selected time range.
5	The OS top processes AVG of used RAM (working set). AVG is calculated for the selected time range. The list is ordered by the most significant consumer to the least.
6	Sorted list of databases that consume the most amount of buffer cache. This list is manufactured from the queries themselves, not from the distribution of the buffer cache. In addition, this is a ratio–based list, meaning if you sum all database values, you will reach 100%.

Memory – Part 2

*Legend*
1	Wait control sorted by wait events that are associated with memory consumption: CMEMTHREAD – Caused when a task is waiting on a thread-safe memory object. The wait time might increase when there is contention caused by multiple tasks trying to allocate memory from the same memory object. RESOURCE_SEMAPHORE – Caused when a query memory request cannot be granted immediately due to other concurrent queries. Long waits and wait times may indicate an excessive number of concurrent queries or excessive memory request amounts. RESOURCE_SEMAPHORE_MUTEX – Caused when a query waits for its request for a thread reservation to be fulfilled. The wait also occurs when synchronizing query compile and memory grant requests. RESOURCE_SEMAPHORE_QUERY_COMPILE – Caused when the number of concurrent query compilations reaches a throttling limit. High waits and wait times may indicate excessive compilations, recompiles, or uncachable plans. RESOURCE_SEMAPHORE_SMALL_QUERY – Caused when a memory request by a small query cannot be granted immediately due to other concurrent queries. Wait time should not exceed more than a few seconds because the server transfers the request to the main query memory pool if it fails to grant the requested memory within a few seconds. High waits may indicate an excessive number of concurrent small queries while the main memory pool is blocked by waiting queries. SOS_RESERVEDMEMBLOCKLIST – Caused when internal synchronization in the SQL Server memory manager occurs.
2	PLE Graph over a selected time range. Page life expectancy (PLE) -Indicates the number of seconds a page will stay in the buffer pool without references. The recommended value of the PLE counter is (updated: minimum of) 300 seconds. I have seen this value on busy systems to be as low as 45 seconds and on an unused system, as high as 1250 seconds. Page Life Expectancy is the number of seconds a page will stay in the buffer pool without references. In simpler words, if your page stays longer in the buffer pool (area of the memory cache), your PLE is higher, leading to higher performance as every time a request arrives, there is a chance it may find its data in the cache itself instead of going to the hard drive to read the data.
3	Queries control – sorted by logical reads in descending order.

Performance – Disk/Partitions View

Use this view to recognize disk issues in IO-intense systems.

Disks/Partitions – Part 1

*Legend*
1	Main partition distribution for the server showing total used vs. total size.
2	Partition selection: by default, all partitions are selected. This allows a focus on a specific partition.
3	Enhanced Chart of Disk latency of timeframe. Disk latency is the time that it tafies to complete a single I/O operation on a blocfi device. For hard drives, an average latency somewhere between 10 to 20 ms is considered acceptable (20 ms is the upper limit). For solid-state drives, depending on the workload, it should never reach higher than 1-3 ms. In most cases, workloads will experience less than 1ms latency numbers.
4	Enhanced Chart of Disk Queue Length. The Disk Queue Length counter shows you the average number of read and write requests that were queued on the selected physical disk. The higher the number, the more disk operations are waiting. It requires attention if this value frequently exceeds a value of 2 during peak usage of SQL Server.
5	Sorted list of databases that consume the most amount of disk activity. This activity is taken directly from the OS activity of the database files.
6	A list of the most active database files. The activity is the sum for the given time range.

Disks/Partitions – Part 2

*Legend*
1	Enhanced Chart of disk read activities for the selected time range.
2	Enhanced Chart of disk write activities for the selected time range.
3	Queries Control sorted by logical reads for the selected time range.
4	Enhanced Chart for IO waits that are associated with IO operations: ASYNC_IO_COMPLETION – Caused when a task is waiting for an asynchronous I/O operation to finish. IO_COMPLETION – Caused when waiting for I/O operations to complete. This wait type generally represents non-data page I/Os. Data page I/O completion waits appear as PAGEIOLATCH_* waits. PAGEIOLATCH_DT – Caused when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Destroy mode. A long wait time may indicate problems with the disk subsystem. PAGEIOLATCH_EX – Caused when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Exclusive mode. A long wait time may indicate problems with the disk subsystem. PAGEIOLATCH_KP – Caused when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Keep mode. A long wait time may indicate problems with the disk subsystem. PAGEIOLATCH_NL – Used for informational purposes only. PAGEIOLATCH_SH – Caused when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Shared mode. A long wait time may indicate problems with the disk subsystem. PAGEIOLATCH_UP – Caused when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Update mode. A long wait time may indicate problems with the disk subsystem.
5	Enhanced Chart of disk storage over the selected time range.

Performance – Network View

Legend

Enhanced Chart of network activity over time. Received, Send, Total network activity in bytes.

Wait Control – wait events that are associated with the network for the selected time frame:

ASYNC_NETWORK_IO – Caused when network writes are blocked behind the network. Ensure the client is processing server data.

Performance – Blocking View

In SQL Server, blocfiing occurs when one session holds a locfi on a specific resource and a second SPID attempts to acquire a conflicting lock type on the same resource. Typically, the time frame for when the first SPID locks the resource is small.

Long transactions occur when a single session opens a transaction and never commits or rolls back the transaction. When DML operations occur during an open transaction, they might create a lock on a specific resource, thus escalating to a block. In most applications, transactions are quick and should not stay open for long.

Experda tracks transactions longer than 3 minutes and blocks longer than 30 seconds.

Note: The default Experda Agent sample interval for blocking is every 60 seconds. This means that if your block occurred in between samples, it might not appear in the results.

Blocking – Part 1

Legend
1	Total unique blocking sessions found for the selected timeframe
2	Longest blocked session wait period for the selected timeframe.
3	Total long transactions for the selected timeframe.

Blocking – Part 2

The blocking details grid shows the tracked blocking and long open transactions for the selected timeframe.

Legend

Filter selection defaults to view all types. Other options: Blocking, Open Transactions

Main results grid – the blocking/long transactions details table shows in detail the blocking sessions or blocked sessions as a hierarchy. If the block is a blocking type, you can expand the row to see the complete hierarchy of blocked sessions.

Name	Description
SPID	Blocking/Long transactions session ID
Type	Either “Open Transaction”/”Blocking”
Status	The last status of the blocked session
Login Name	The session’s login
Wait time	Total wait time
Start time	Query start time
Duration	Total lock duration
Query	The locking query/batch. Hovering will display the tooltip with some more text. To see the full text, hover on the row and press the “view query” button at the end of the row on the right.
Host	The session’s host
Application	The calling application
Database	The database that was in the context of the session at the time of the block
Last Batch	The time of the session’s last query execution

In the example, we can see all the sessions that are blocked and their blocking parents.

Blocking by application

Blocking by application module helps recognize the most troublesome top applications that are causing the most blocks.

Legend
1	Top applications by AVG block time
2	Top application by total block time
3	Top application by total block count

Performance – Deadlocks View

A deadlock occurs when two or more tasks permanently block each other because each task has a lock on a resource that the other tasks are trying to lock. For example:

Transaction A acquires a shared lock on row 1.
Transaction B acquires a shared lock on row 2.
Transaction A now requests an exclusive lock on row 2. It is blocked until transaction B finishes and releases the shared lock it has on row 2.
Transaction B now requests an exclusive lock on row 1. It is blocked until transaction A finishes and releases the shared lock it has on row 1. Transaction A cannot complete until transaction B completes, but transaction B is blocked by transaction A. This condition is also called a cyclic dependency: Transaction A has a dependency on transaction B, and transaction B closes the circle by having a dependency on transaction A.

Both transactions in a deadlock will wait forever unless an external process breaks the deadlock. The SQL Server Database Engine deadlock monitor periodically checks for tasks that are in a deadlock. If the monitor detects a cyclic dependency, it chooses one of the tasks as a victim and terminates its transaction with an error. This allows the other task to complete its transaction. The application with the transaction that terminated with an error can retry the transaction, which usually completes after the other deadlocked transaction has finished.

In each deadlock, there is a single winner and one or more victims that are killed by the system.

Legend

Main Deadlock List – according to the selected timeframe. Notice that if you don’t see deadlocks, it might be because the time range is too short.

Column

Description

Time

The deadlock time

Victim SPID

The session ID of the victim

Host

The host of the winner session

Application

The application of the winner session

Database

The calling database of the winner session

Text Data

The winning query. To see the full query, press #6

Show Details

Once pressed, it will show the Deadlock Chart (#5), allowing for further analysis

DeadlockExpand Button – the button expands the selected deadlock and shows the 4 grid

Show Details button – shows the Deadlock Chart

Inner Victim Grid. Once 2 Is pressed, the victim grid is displayed. For most occurrences, there will be a single victim.

Column	Description
Process SPID	The victim’s session ID
Host	The host of the victim session
Application	The victim’s calling application
Database	The victim’s originating database (from query)
Login	The victim’s login
Log Used	The log used during the transaction
Lock Type	The victim’s lock type of the locked object

The deadlock chart

The victim expand button – press to see process details on the right panel (#8, #9)

The winner expand button – press to see winner process details on the right panel (#8, #9)

Expanded process details

Column	Description
Host	The process originating host
Application	The process calling application
Database	The process database (from query)
Login	The login that started the process
Log Used	The total log used during the transaction
Deadlock Priority	Priority of the deadlock from the SQL engine
Wait Time	Total wait time before deadlock detection and elimination
Mode\Type	Process lock type of locked object
Isolation Level	Session isolation level at the time of deadlock
Transaction

The deadlock chart

*Legend*
1,3	Victim lock mode
2	Victim expand button (expands panels on the right with victim’s details)
4	Victim object lock
5	Winner object lock
6,8	Winner lock mode
7	Winner expand button (expands panels on the right with winner’s details)

Performance – Queries View

The chart will portray the top 5 batch/queries according to the selected performance metric.

*Legend*
1	Select a performance metric to receive the top 5 “worse” queries. Default is by “Duration”. Other options:
2	Chart type selection. Default selection is Bars. Other options:
3	Results of charts displaying top 5 query activity sorted by #1 over the selected time range.

Standard query control. Default sorting is by duration.

Performance – Temp DB View

Temp DB – Part 1

*Legend*
1	Enhanced Chart of Total read activity over the selected time range of all TempDB files.
2	Enhanced Chart of Total write activity over the selected time range of all TempDB files.

Temp DB – Part 2

Legend

TempDB table count distributions. There are five types: Global temp, regular temp, user table, system tables, and internal tables.

Enhanced Chart for wait events caused or affiliated with TempDB database:

PAGELATCH_DT – Caused when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Destroy mode.

PAGELATCH_EX – Caused when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Exclusive mode.

PAGELATCH_KP – Caused when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Keep mode.

PAGELATCH_NL – Used for informational purposes only.

PAGELATCH_SH – Caused when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Shared mode.

PAGELATCH_UP – Caused when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Update mode.

SLEEP_TEMPDBSTARTUP – Caused when a task waits for TempDB to complete startup.

Performance – Waits View

Main waits chart

Main wait event Enhanced Chart distributed by the main wait categories: IO waits, Network waits, Backup waits, CPU waits, Non-Page Latch waits, Lock waits, Transaction Log waits, Tempdb waits.

Wait Category Grid

Legend

Wait event filters:

Filter by category name
Filter by category type
Filter by description

Wait event category expand button (Opens up expanded view) and grid

Column	Description
Category	The main wait categories are IO waits, Network waits, Backup waits, CPU waits, Non-Page Latch waits, Locks waits, Transaction Log waits, and TempDB waits.
Percent	Percent of wait from total in activity at sample time.
Wait Count	Total wait events for the selected timeframe.
Total Waits (ms)	Total sum wait duration (ms) for the category in the selected timeframe.

Wait Category Detailed Grid

Displays the detailed wait events for the selected wait category.

Column	Description
Type	Type of wait event.
Percent	Percent of wait from total in activity at sample time.
Wait Count	Total waits for the type for the timeframe.
Total Waits (ms)	Total sum wait duration (ms) for the detailed wait event in the timeframe.

Performance – Services View

Legend

Services grid. This grid is directly affected by the selection of the timeframe.

Column	Description
Service Name	OS service name
Display Name	OS service display name
Status	his is the last status according to the timeframe. Status can be Running or Stopped.
Running Chart	A chart that shows when the service was running or stopped.
Account	Name of the account running the service.
Service Mode	Manual/Automatic
Path	The executing file path
Change Status	he button that allows you to stop/start the service

Paging control

Services filters.

Filter by service
Filter by service logon
Filter by service executing file name or path.

Button that opens another dialog that allows you to stop/start the service remotely.

Performance – Processes View

Legend

Main process grid display + aggregation function selection (Average, Max, Min). The default is Max. Pay attention to the timeframe.

Column	Description
Service Name	OS service name
Display Name	OS service display name
Status	This is the last status according to the timeframe. Status can be Running or Stopped.
Running Chart	A chart that shows when the service was running or stopped.
Account	Name of the account running the service.
Service Mode	Manual/Automatic
Path	The executing file path
Change Status	Button that allows you to stop/start the service

Standard paging control

Process grid filters:

Filter by process name
Filter by command line for the process (directory/executing file name). – Command line information only exists for some processes

Security – Health Check Report View

Health Check Reports

The detailed analysis of the security score for a given server. The analysis is done by running multiple health check tests (full list in appendix). Each test is either successful or a fail. The final score is the weighted average of the successful tests.

You can change the tests that run on each health check report by going to the health check report settings. In addition, you may exclude unimportant tests from the user interface directly from the result grid.

Health check reports are run either manually or automatically every week for every server.

Note: Any excluded health check test will continue to be excluded in future reports

Legend

Health check options:

Export PDF – will create and export a detailed list of the last health check report (displayed #6).
Health check settings – will transfer to health check settings.
Check Now – immediately runs a health check analysis. Allow up to 30 seconds for the health check report to complete and be displayed.

Security score of the last health check report.

A chart of security score over time to show your health progress.

Total risks found in the current health check report.

Total Unexcluded risks in the current health check report.

The results of the current health check report

Column	Description
Property	The name of the health check test
Status	The status of the test: Success/Failure
Value	In some tests, there is an additional value. For example, the test “service status” recommended status which tests the status of each SQL service status. The value may be the agent, browser services or other SQL services on the same machine.
% Of Total Score	The weight of the score of the test from the total.
Details	A further description of the result of the test.
Tip/Advice	Recommendation for what to do in case this test has failed.

Health Check Report Settings

The health check settings are similar to the alert settings in the way that predefined templates are used to activate specific health check tests.

From the health check report settings, you may disable certain tests that you don’t want to run.

It’s important to note that once a single test has been excluded from the report page, it will be excluded automatically from all future health check reports that run on the same server.

There are two types of health check report tests – for OS and SQL servers. Thus, there are two types of health check report templates as well. You may create as many templates as you wish. However, only a single template of each type may be attached to a server.

Any new affiliation to a machine or a server will remove the affiliation from the previous template and attach it to the new template.

After affiliating your machine/server to a template, you may customize specific servers/machines by choosing the “custom setup.”

How do I find the health check report settings?

Creating a new health check report template

There are two ways to create a new health check report template:

To create a new blank template, press the “[+]Create Template” button. A create new template dialog will Choose the new template name and its type.
The second method (recommended) will be to start with an existing template. Pressing the (…) on the top right corner duplicates the template. A dialog will appear where you will need to enter the duplicated template’s name.

Affiliating a server with a template

To affiliate a server or machine with an existing template, click on the template name and go to the “Affiliated machine” tab (if the template is a SQL type, it will say “Affiliated server”). Then press the “Add server” button and choose your machine/server.

To remove the machine or server, just click on the X next to its name or deselect it from the list of affiliated machines/servers.

Customizing a server to exclude/include tests

By default, your server will be attached to a template. If you wish to customize a specific server not to include specific health check tests, you first choose the “Custom Setup.” Remember that this option will unlink the specific test from the template. You can always undo this.

From the custom setup:

Find your server from the tree list and click
Find the relevant If it’s a machine-level test, it will be found in the Host tab. If it’s a SQL-level test, it will be found in the SQL tab.

Customizing a server to exclude/include tests

From the custom setup:

Find your server from the tree list and click it.
Find the relevant If it’s a machine-level test, it will be found in the Host tab. If it’s a SQL-level test, it will be found in the SQL tab.
Disable/Enable the selection. This will cause the test to be unlinked from the template, which you can then undo by pressing the “reset” button next to it.

Security – Suspicious Activities View

The suspicious activity is limited to a total of 50,000 records per server. The purpose of this module is to help identify activities from users or applications that are found to be suspicious and unwelcome in your environment.

At first, all records will be of identified, common sources. You should eliminate known applications and logins by adding them to the whitelist.

By default, the Experda agent runs two separate background profiler sessions that filter by the whitelist. These sessions do not cause an overhead as they are capped at 50,000 records. They will only continue after the user has whitelisted entities.

Adding a login or an application to the whitelist will automatically delete all its records and free the suspicious activities module to continue searching for new suspicious activities.

The suspicious activity module creates two separate trace files that it monitors all the time. The more whitelisted logins/applications there are – the more efficient the trace file search becomes and less of a burden on the system.

Your module activates automatically upon adding a new managed server. However, you may cancel it at any time by disabling it (see 4 from the diagram below).

This module does not correlate to the main timeframe control.

Note: The suspicious activities will automatically stop running after 50,000 activities have been collected and will continue upon adding applications or Logins to the white list.

Legend

Total suspicious activities found for the given server. Note if this number reaches around 50,000 activities, the module will automatically hold until objects are added to the whitelist, which then releases storage until 50,000 records are reached again.

It is ideal to have a minimum number of suspicious activities by “cleaning” and adding objects to the whitelist from the suspicious activities settings.

Total suspicious applications found (not whitelisted)

Activities of logins grouped by day and logins.

Settings button – redirect to the suspicious activities settings to add/remove objects from the whitelists.

Total suspicious logins found (not whitelisted)

Total databases where suspicious activities have been found.

Detailed Results Grid of all suspicious activities found.

Column	Description
Additional options	Add to whitelist – will add the login name and delete all suspicious activities for the login. Copy details – copies the details of the transaction to the clipboard. Delete – will delete the row.
User name	Name of login
Date time	Time of activity
Application	Application name originating the activity
Database	Database context used at the time of the activity
Details	The details of the query

Filtering the suspicious records. Possible filters are by login name or application name. Filtering will also affect the grouping.

In addition, there is a global search textbox to find specific logins/queries/ applications easily.

Suspicious Activity Settings

The suspicious activity module is built around the whitelists. There are two whitelists: Applications and Logins.

To make the best use of the suspicious activity module, review the whitelists weekly and remove any non-threatening users or applications that you recognize.

Main Suspicious Records Settings – Records View

Legend

Filter by logins/applications. Default is filter by user.

Search textbox to find logins/applications easily.

Results grid

Name	Description
User Name/Application Name	Filter by logins/applications. Default is filter by user.
Row count	Total found suspicious activities.
Move to whitelist	Action button – sends the selected login/application to the whitelist and deletes all records from the suspicious activities.

Main Suspicious Records Settings – Whitelist View

Legend

Toggle the page between the suspicious records view and the whitelist view.

Enable/Disable the entire module for the selected server. This module is activated by default.

Multiple selected entities “add to whitelist” action.

All whitelisted logins list

Name	Description
User Name	The name of the whitelisted login
Add to blacfilist button	Removes the selected login/user from the whitelist.

Add to blacklist buttons that remove the selected entity from the black list.

The whitelisted application list

Name	Description
Application	The name of the whitelisted application
Add to blacfilist button	Removes the selected application from the whitelist.

Capacity Planning View

Capacity planning modules assist in making proper long-term storage strategies. With this module, we can easily visualize non-standard growth patterns in our disks, directories, databases or files.

The default sampling interval for the capacity planning agent samples is daily.

Note: By default when using this module aim to have a longer time range selected (a week or above).

Legend

Disk/partition related information. The collected disks are of above 5G. For each disk, the following information is displayed:

Disk Name
Current disk space used (latest)
Disk total growth for the selected For example, if the selected timeframe is “last week,” the total growth was a negative 30.68% from the beginning of the week’s sample.
Progress bar visual indicator to see how much free space is Total Disk capacity

Disk important OS directory size distribution. For each directory, we can see the total size and the growth for the selected timeframe.

By default, the directories that are sampled are backup directories or mdf ldf file locations.

Capacity Planning – Part 2

The prediction visualization chart:

The prediction chart is split into two parts. The first part (on the left) is created from actual data that has been sampled. The second part (on the right after the separator line) is the predictive part. The series is then drawn by using a dotted line to show the prediction.

Each prediction is based on the previous growth using basic linear mathematic equations.

Use this module to decide the proper time for an enlargement of disk space.

As with the rest of the capacity planning module – we suggest using long timeframes when using this module, as the sampling intervals are daily.

Chart options are depicted in the legend below.

*Legend*
1	Disk predictions chart – show the current disk used space growth vs expected upcoming growth.
2	Features/grouping options for prediction charts. See fully explained below. Percent Of Use Vs. Percentage Of Growth Prediction Size From Chart Prediction Grouping
3	Database growth prediction chart. Shows the total size of the actual database vs. expected growth.

Percent Of Use Vs. Percentage Of Growth

Percent of use – the total percent from the usage. For example, if C:, the first sample is 10G, and the second sample is 15G, the percentage growth will be 50%.

Percent of growth – the growth in actual values (see Y axis on chart). For example, if C:, the first sample is 20G, and the second sample is 21G. The growth will be of 1G.

Prediction Size From Chart

The size of the prediction part from the total size of the chart.

Prediction Grouping

The prediction grouping goes hand in hand with the selected timeframe. If your selected timeframe, for example, is for last month, selecting a week range will group results by week, giving approximately four points for each series in the left part of the chart.

Selecting a day range will result in approximately 30 dots for each series on the left of the chart.

Always On / High Availability

The High Availability module is built around always on capabilities of SQL server. This module will display all associated always on information for the currently selected server.

The information is displayed by Availability Groups and allows drill down into each group separately.

The Availability Groups will be displayed for the selected server whether it acts as a secondary or primary server in the availability group.

Each Availability Group is comprised of 2 or more servers, one of which is the primary server, which are then comprised of the databases themselves.

Availability Group List Overview

*Legend*
1	Availability Group List – by default all groups are collapsed. To open an active group use the icon.
2	Active Availability Group – When an Availability Group is Enabled (has no issues) you may collapse or expand on it to view it’s database sync information (see next section: expanded availability group
3	Search + Filters. Use the filters to focus on a specific group \ database or search for a particular state. Failover Mode – Automatic \ Manual Health Status – the health status of the group is comprised of the health status of the underlying databases.

Expanded Availability Group

Legend

Status over time – displays the health status changes of the group over the course of the selected timeframe. An empty “status over time” when having a “healthy” availability group is a good indicator.

Sync Latency chart – displays gaps in sync latency over the selected timeframe. The closest the chart series are to 0 the better.

Database list – the internal database list the comprises the availability group split into servers.

Name	Description
Server Name + Role	Name of the server and it’s role in the AG. Can be primary or secondary
Failover Mode	The failover mode of the server replications. Can be Manual or Automatic
Sync Databases	The accumulated value of how many databases are healthy and synced in the server
Details	Provides any additional information about the current status
Last Commit Time	The last commit time of the database within the syncronization
Redo-Queue Size	The redo queue size is the size of transaction logs between its last_received_lsn and last_redone_lsn. last_received_lsn is the log block ID identifying the point up to which all log blocks have been received by the secondary replica that hosts this secondary database.
Latency (MS)	The delay that happens between the syncronization process of the databases within the availabilty group.

Jobs

Experda has 2 levels of job views:

The first level is the company overview, which allows you to see all jobs across all servers.
The second level is the server job.

From either view – press on the job name to retrieve the job history from the managed server msdb database.

Jobs Company Overview

Legend

To reach the company job overview, press the “company overview” on the top left of the main server hirerechy control and then press “jobs”.

The main job table specifications

This table is updated every 60 seconds from each active monitored server.

Name	Description
Job Name	The name of the job
Server Name	The MSSQL server that runs the job
Enabled	Wether the job is enabled or disabled
Run status	The last running status of the job
Last duration	The last running duration time of the job
Last run	The last time of the job
Next run	The estimated next time the job will run (for scheduled jobs)
Scheduled	Wether the job is scheduled or not
Created	The time the job was created
Modified	The time the job information was modified
Description	A description of the job
Owner	The name of the login that owns the job
Start step	The first step of the job
Steps	The total steps of the job
Status	The current status of the job

The Job Filters. Use these filters / search capabilities to find particular jobs.

Jobs Server Overview

The job server overview provides a server filtered job header information. The displayed table and filters are similar to the Job Company Overview table and filters.

Specific Job History

To see the job history of a specific job, double click the job name in the job’s tables or press the 3 dots on the left of each job row. The job’s history will be retrieved from msdb of the managed server and will depend on the purging policy of the managed server’s msdb (usually 30 days back are reserved).

Legend

The selected job’s name + latest status

This main job history instance table. Each row in this table is a single job occurance, which can end in a successful execution of all the job’s steps, partial execution or failed execution.

From each row you can drill down to the details of the job occurance.

Name	Description
Run Datetime	The time of the beginning of the job occurance
Duration	The total duration of the job occurance
Status	The status of the job occurance
Steps	The total steps that ran in the job occurance

tA drill down of each step in the job occurance.

Name	Description
Step name	The name of the step
Status	The end result of the step’s execution.
Run time	The start time of the step
Duration	The total duration of the step until end result has been reached
Message	A detailed message. Usually the TSQL output of the steps.

Alerts

The alerts module assists in the prevention of errors and allows the creation of alerts on multiple levels. It allows us to automatically detect unanticipated outliers, anomalies, and errors and create watchdogs that alert a user or a group of users in multiple ways.

The alerts are split into two general types of alerts, and each server needs to be monitored for both types. For a full list of supported alerts, please review Appendix A – list of supported alerts.

OS Alerts – any operating system information-related alerts. The following are alerts that are based on OS-related information. The categories in OS Alerts are Health Check, Logical Drives, Memory, Monitor, Process Performance G Processor/CPU. For example, alert “CPU above threshold.”

SQL Alerts – all alerts that are directly related to SQL Server. The categories in SQL-related alerts are Always on, Backups G Recovery, Database Properties, Database Files, Database Monitoring, Database Properties, Health Check, Monitor, Scheduled Operations, and Server Diagnostics. Inside SQL Alerts, some alerts relate directly to the level of the database, while others are at the level of the whole server. For example, alert “Database has no backup”.

Understanding alert attributes

Active alert – an alert may be active until it automatically becomes inactive or is deleted by the user. For example, if we have the alert “CPU above threshold” set to be raised on the CPU at 20% and the CPU is at 25%, the alert will be raised and activated. However, if the CPU is reduced to 15%, the alert will still be displayed, but not as active.

Dismissed alert – You may choose to mark an alert as “automatically dismissible.” In that case, if the alert conditions are no longer met, the system will automatically dismiss the alert. It will remain in your history, but not active. The user may manually dismiss an alert as well.

Alert Templates

When using the alert module, we highly recommend using alert templates and attach/affiliate the template to each server. The attached template will immediately associate all template alerts with the attached server and will save you a lot of time.

The defined alert configurations percolate from the template level to the server level and to the database level.

An alert can then be customized for a database or server. By customizing an alert on the server or database level, we actively “detach” the alert from the template. To understand more about alert customization, see the next section.

Alert Dashboard View

The alerts display is always accessible from the top right corner when you are staying on a server in the server tree. Press the triangle icon to open the alert drawer.

Alert Drawer Display

*Legend*
1	Opens the alert, full view to view the alerts in a table with more information.
2	Toggle between seeing only the active alerts view and all alerts.
3	Toggle between seeing only the active alerts view and all alerts.
4	Sorting options: default is by date. You can also sort by severity.
5	The alert details (title, date and description).

Alert full view

Legend

Toggle between seeing all alerts and active alerts.

Close alert view

Severity total counter control

Global search inside the alerts by entity and text and

alert filters, allowing filter by alert category, status and read status.

Alert table

Name	The alert name (full alert list can be found in Appendix A)
Status	The alert may be active, inactive or dismissed. If the alert is active, it means that it is ongoing, and the conditions for it to be raised are still enabled. The alert can be dismissed by the system if the alert configuration is marked as allowed to dismiss the alert when the conditions for the alert occurrence stop occurring. means that it is ongoing, and the conditions for it to be raised are still enabled. The alert can be dismissed by the system if the alert configuration is marked as allowed to dismiss the alert when the conditions for the alert occurrence stop occurring.
Date created	The date the alert was first raised.
Date closed	The date of the time the alert was dismissed – in case the alert was dismissed.
Dismissed by	The user that dismissed the alert (for auditing purposes).
Category	The alert category.
Server	The server that raised the alert.
Database	In case of a database-level alert – the database name.
Description	Additional information about the alert.

Configuring alerts

How to get to the alerts configuration

From your selected company, select the “settings item” icon from the top left menu. Then, choose “Alerts” from the left menu.

Alert Configuration Views

Alert Configuration general view

Legend

A selection between the “Template View” (Default) and the Custom Servers View.

Create a new template button.

Change tabs between the SQL/Host view, which displays all the alert configurations, to the Affiliated servers/hosts view.

Navigation to a specific category.

Total alert configuration counter + search for specific alert configurations

Alert configuration row view

*Legend*
1	Alert name + sum of total included/excluded entities.
2	Short description of the alert for all severity levels.
3	Enable/Disable alert.
4	Press to configure the alert

Enable/Disable an alert configuration. Notice that if you choose to enable an alert configuration from the row, it will automatically receive the default options. To further configure, you need to use the 8 to see all options.

Affiliated Server View

From the affiliated server view, we affiliate or detach a server from a template. This immediately associates all of the template’s alert configurations with the attached servers.

If the server had previously been attached to another template of the same type (Host/SQL), all linked alert configurations will automatically be removed. All unlinked/custom alert configurations will stay untouched as they are considered to be specifically altered to serve the purpose of the user.

Note: when we detach a server from a template all unlinked or customized alert configurations are not removed.

*Legend*
1	The affiliates server tab. Press “Affiliated Server” to get to the affiliated server view
2	Action to open the “Add remove dialog.” Select the servers you wish to “detach” or “add” to the tem- plate from this dialog and press “Add servers.”
3	A quick list to see the existing attached servers. You can easily detach a server from a template by pressing the “X” next to its name.

Alert Configuration view

*Legend*
1	Alert Configuration Name. Action “Reset to default” -> will reset the configuration values to the original recommended default values. For example, Critical value will change to 20%. Action “Unlinked” -> Pressing the X will restore the alert with its parent’s configuration. Action “Enabled” – pressing will enable or disable the alert.
2	Rule Details For some alerts, there are no additional parameter configurations. Alerts like “Database has no backup,” for example, has to “escalation.” They are boolean alerts, and in that case, a single drodown list with the severity level will be displayed In other, more complex alerts, three severity levels will be visible, each one completely customizable to allow maximum flexibility. At least one severity level needs to be selected. Once an alert has become active, it will stay active as long as it is still at a defined severity level. For example, if an alert on CPU that is defined on information level above 20% and critical above 50% goes under 20%, it will remain unread but inactive. Persist over option will determine if the alert is checked for persistance. For example, if by default in the rule shown on the left CPU goes over 30%, the alert will automatically be raised. However, if the value of 120 seconds is selected in the “presist over” selection, the rule will not be immediately raised but will wait until the CPU is over 30% for a consecutive time of 120 seconds. The sampling is done every 60 seconds by default. Automatically Dismissed – use this option to automatically dismiss alerts that have “fixed themselves” and are no longer important. In our example, if this option is selected and there is an active alert for the CPU, and the next CPU value is under 20% (minimum “information” severity level), the alert will automatically dismiss itself.
3	Notification/Messaging Options Panel Use this panel to select the action that occurs after an alert is raised for the first time. By default, you may see the alert from the alert dashboard. However, you may also receive an SMS alert and/or an email alert. Use this panel to choose your email/SMS recipients and the severity level. You may choose a single contact or contact groups to allow maximum flexibility.

How to create a new alert template

There are two ways to create a new template:

Press the [+] Create Template button to open the new alert template Fill in the new template name and select its type (OS = machine-level alerts, SQL = SQL service-level alerts). Then press the “Create template” button.
You can stay on an existing template and duplicate This is recommended if you wish to make minor changes to an existing template instead of creating a new one.

Alert Customization – Understanding “Linked” And “Unlinked” Alerts

The proper usage of the alert module is by creating alert templates and affiliating them with your servers at the Host and SQL levels.

However, there are times when we wish for an alert to either be defined differently or customized for a specific machine/server/database.

When we use the custom options, we “unlink” the alert from the template by choosing separate alert parameters or messaging configurations. This alert will now not be affected when we change its corresponding alert configuration in the template itself. The alert at the database level may be unlinked from the server level, and the alert from the server level may be unlinked from the template level.

Any change in the custom options of the alert will unlink it from its parent. For example, adding an email to the notification, changing its severity level or setting a specific alert parameter value.

For example. Server A is associated with Template A, which has the “CPU Above” alert set to alert when the CPU level rises above 50%. If we customize the alert in template A to a value of 60%, as long as server A is linked to the template, the alert value will change for Server A as well to 60%. However,

if we customize the alert value specifically for Server A to 70%, the alert will then be unlinked from the server, and any changes to it on the template level will not affect it but will affect all other associated servers to the template.

The alert will then be visible as “unlinked” at the custom server level. Example:

How to return an unlinked alert to be linked to its parent.

Any customized alert will automatically be marked as unlinked if the entity has a linked affiliated parent. To return the customized alert to the template, we can press the X on the unlinked mark from the view or from within the alert details themselves.

We can also “re-link” an alert from the main view by pressing the “X” next to the unlinked alert.

Alert Customization – understanding “includes” and “excludes”

The alert module uses a hierarchy of templates-> servers-> databases. As servers may be excluded/included from a template and databases may be excluded/included from a server – we will use the word “entities” to generally describe either a server or a database with its respective association with its parent.

We will sometimes wish to exclude certain alerts from being checked on certain entities. For example, if my main template, Template A, is affiliated with 20 different servers, and I want a specific server not to be included for a particular alert – I will exclude that server from that alert. The same logic works between a custom server and its affiliated databases.

Excluding certain servers from a template allows for any other new servers that are added to have the same logic as the template and will not affect the excluded servers. When adding a new database, for example, the server’s alerts will automatically be inherited by the new database, as the assumption is that all monitored databases should be monitored in the same way unless customized or excluded.

However, including has a different logic. When a user chooses to include a certain server, all other servers will automatically be excluded, even new servers that are added. The same logic works for databases. When the user chooses to include a specific entity, the assumption is that the alert is then “tailored” for that entity, and no other entities should be associated with it. That also means that any newly added entities will not be automatically affiliated with the template/server unless specifically added to the list of included entities.

Examples:

My template “Default SQL” has the following alert defined: “Always On – Database Not ” However, two of my servers are not defined as “Always On,” I will want to exclude all “Always On” alerts for those two servers.
My server has many databases that are not active, and I don’t back them up. In this case, I will exclude all inactive databases from the test “Database No Backup Exists.”
My server has 50 databases, but only 3 of those databases are priority databases that I’d like to also check for index I will choose the “Database Fragmented Index” alert from my server (custom level) and include only those 3 databases.

How to exclude a server from an alert configuration (template level)

Select your template and open the configuration for the selected
Inside the alert dialog, select “Exclude,” and from the dropdown options, select “Exclude”
Select the “Choose” dropdown menu to open a new dialog to select the servers from a Choose the servers you wish to exclude from the template and press “Add Servers.” From this dialog, you may remove or add the excluded server.

How to exclude a database from an alert configuration (custom server level)

Select your server from the custom servers and open the configuration for the selected alert.
Inside the alert dialog, select “Exclude,” and from the dropdown options, select “Exclude”
Press the left textbox to display the list of available databases Choose the databases you wish to exclude from the template and press “Save.” From this dialog, you may remove or add the excluded databases later on.

How to include a database from an alert configuration (custom server level)

Select your server from the custom servers and open the configuration for the selected alert.
Inside the alert dialog, select “Exclude,” and from the dropdown options, select “Include.” Remember that the included database logic works differently from the excluded logic. See “Alert Customization – understanding includes and excludes” to understand more.
Press the left textbox to display the list of available databases. Choose the databases you wish to include and press “Save.” From this dialog, you may remove or add the excluded databases later on.

Alert Customization – specific database type selection from template level

For alerts that are directly affiliated with a database, there is another automatic filtering option that is derived directly from their template. As you cannot exclude “databases” directly from a template, this feature allows you to associate a database to template alerts via their properties from the template level.

Remember that this rule will apply to all affiliated servers and their corresponding databases.

Examples:

We wish to run the alert “No backup found” on all databases except for the system databases.
We wish to have “Database No Backup Exists” for Log-type backups to all databases that have a full recovery

Note: Only database-appropriate alerts will have this feature enabled at the template level. Examples of database level alerts: “Database Old Restore”, “Database No Backup Exists”, “Database Object Lock”, “Database Status”

Name	Description
All DBs	All databases (default)
System DB	Only system databases: Master, Experda, Model, MSDB, TempDB
Full DB	All databases that have a full recovery model.
Simple DB	All databases that have a simple recovery model
All Without System	All databases except for the system databases

How to set the database type from the template level

Goto your SQL type template and open your database level alert
Press the “Excludes” tab, and from the first dropdown list, select the database grouping you wish to affiliate with the alert. After selection, press “Save.”

Finding your way around the alert configuration module

List Of Alerts (Appendix A)

Alert statistics

Alert Viewing/Indicators

Settings

The alert settings will only be available once an active server has been selected.

To reach the settings, select the settings icon from the top left menu.

Company Profile View

Legend
1	The company logo. The logo is displayed in different locations across the UI.
2	The name of the company. This is displayed in the main dashboard, the server hierarchies, reports and throughout the UI.
3	The company’s HQ country location.
4	The company’s HQ country location.

Membership and Permissions View

User Permission Roles

Owner – A powerful user on the level of a single client/company. They are allowed to do everything on the level of the client, including user management and payments.

SysAdmin – Can do everything at the level of the client/company except handle payments. Only SysAdmin privileges and above may change permissions.

Editor – Can edit all servers for the client/company except those that are implicitly denied.

Viewer – Can watch/view all servers for the client/company except those that are implicitly denied.

Coordinator – This user has specific privileges given to them on the action or server level.

Members View

Legend
1	Choose between the member’s view and the Group’s view.
2	Total count of contacts in the active view
3	Invite member action. This will invite a new member and give them permissions.
4	Members view table view
5	Member permission role
6	Additional options for a member. See edit access to view the full list of servers this member/group has access Add to group – Add this member/contact to a specific Remove from team – deletes the member from this team.

Members edit access panel – Servers view

Legend
1	Name + email of the selected member
2	Toggle between the server view and the group view
3	The list of managed servers for which the member has permissions.
4	Action button to remove the member completely

Members edit access panel – Groups view

Legend
1	Name + email of the selected member.
2	Toggle between the server view and the group view.
3	Remove an active member from the group.
4	Action button to remove the member completely from the team.

Invite Members View

Legend
1	Full name of the invited member
2	Email address of the invited member. The invitation will be sent to this email.
3	The country/timezone for the invited member.
4	The phone number of the invited member. This phone number will be used to send notifications from the system. It will also be used to authenticate the user.

Groups View

Legend

Panel title

Toggle between the Group view and the Member view

Action to Add Group

Main group table

Name	Description
Group Name	The name of the group
Users	Total count of users in the group
Permission	See Roles
Type	See Member Types

Delete group action

Create a new group dialog

Legend
1	Group Name
2	Add/remove employees in the group
3	See Roles

Contacts Settings View

Contacts View

Legend

Toggle between Contacts view and Group View

Add new contact dialog

Contact table

Name	Description
Name	The name of the group
Type	The type of contact (in the future, this may have different options).
Status	Active/Inactive
Email	The contact’s email. This is used for authentication and notification purposes.
Phone	The contact’s phone number. This is used for authentication and notification purposes.
Groups	A list of groups the contact is in.

Groups View

Legend

Toggle between Contacts view and Group View

Add new contact group dialog

Contact Group table

Name	Description
Group Name	The name of the contact group
Contacts	The total number of contacts in the group
Status	Active/Inactive Toggle
Additional options	Edit – open the edit contact group dialog Remove – deletes the contact group

Add contact dialog

Legend
1	The first name of the contact to be added.
2	The last name of the contact to be added.
3	The email of the contact.
4	The contact’s phone number.
5	Toggle if the contact is active or not.
6	Save button.

Add contact group dialog

Legend
1	The name of the new contact group
2	The contacts we wish to add to the group.
3	Active/Inactive toggle.

Performance Settings View

The performance settings is similar to the alert settings in the way that it uses predefined templates to enable/disable specific sampling and to change their sampling intervals (in seconds).

There are two types of performance modules–OS and SQL servers. Thus, there are two types of performance setting templates. You may create as many templates as you wish. However, only a single template of each type may be attached to a machine/server.

Any new affiliation to a machine or a server will remove the affiliation from the previous template and attach it to the new template.

After affiliating your machine/server to a template, you may customize specific servers/machines by choosing the “custom setup.”

How do I find the performance settings?

Creating a new performance setting template

There are two ways to create a new performance setting template:

To create a new blank template, press the “[+]Create Template” A create new template dialog will appear. Choose the new template name and its type.
The second method (recommended) is to choose an existing template, press the (…) on the top right corner and duplicate the A dialog will appear where you will need to enter the duplicated template’s name.

Affiliating a server with a template

To affiliate a server or machine with an existing template, click on the template name and go to the “Affiliated machine” tab (if the template is an SQL type, it will say “Affiliated server”). Then press the “Add server” button and choose your machine/server.

To remove the machine or server, just click on the X next to its name or deselect it from the list of affiliated machines/servers.

Customizing a server to have different performance sampling properties

Note: Disabling any sampling module will result in having incomplete alerts, charts or informatiom from the user interface. Do this only by recommendation from Experda experts.

From the custom setup:

Find your server from the tree list and click
Find the relevant If it’s a machine-level test, it will be found in the Host tab. If it’s a SQL-level test, it will be found in the SQL tab.
Disable/Enable the selection or change the sampling time for the server/ machine and press “Save.” This will cause the test to be unlinked from the template, which you can then undo by pressing the “reset” button next to it, which will return it to its template

Global Settings View

SMTP settings

The SMTP settings may be used in the OnPrem installation and the OnCloud version. The SMTP server needs to be accessible via the internet to be used. Once SMTP settings have been set and tested, all notifications and reports will be sent via the SMTP server.

Legend
1	SMTP server IP/address + port.
2	Toggle SSL.
1	SMTP Server User name + password.
2	The sender account that will appear as the sender of the notifications/reporting.
1	Testing email address.
2	Clear all dialog controls.

Experda Maintenance Overview

How does it work?

The Experda Maintenance Plan Generator provides a comprehensive suite of automated maintenance solutions tailored to meet your database management needs. Maintenance actions can be executed directly from the server object tree or scheduled through the maintenance settings for streamlined operations.

Within the maintenance settings, it is recommended to utilize maintenance templates to group similar servers for consistent management. For instance, all production environment servers can be grouped under a single “Production Template,” which facilitates more frequent backups and scheduled index rebuilds on a weekly basis.

For more granular control, you can define custom server maintenance rules to address specific requirements for individual servers. If a server remains linked to a template, you can either exclude it from a specific rule or create a custom rule that overrides the template configuration.

Scheduled maintenance rules are implemented using SQL Server Agent jobs. To ensure compatibility across all MSSQL editions, including Express editions, a specialized solution is employed. For Express editions, maintenance rules are executed via Windows Task Scheduler, offering alternative scheduling options supported by the Windows operating system.

Manually executing a maintenance rule

To manually execute a maintenance action:

Select Databases: From the server list in the main company view, select one or more databases where you wish to apply the maintenance action.
Single Database Maintenance:
1. Select a single database from the menu and click on the options (represented by three dots).
2. Choose from available actions, such as Restore or other Immediate Actions, to view and execute the desired maintenance tasks.
Multiple Server Maintenance:
1. a. For actions across multiple databases, select the checkboxes corresponding to the databases you want to include.
2. Access the three-dot menu at the server level to apply maintenance actions to all selected databases one after the other.

Supported Maintenance Rules

The following is a list of maintenance rules supported by Experda, each offering a comprehensive set of configurable options that will be detailed later. For optimal database performance and reliability, it is recommended to implement these rules effectively:

Summary

Backup: Safeguards against data loss by creating recoverable copies of the database. Essential for disaster recovery and restoring data integrity following unexpected events.

Reindex: Mitigates index fragmentation caused by frequent data modifications, optimizing query performance and reducing IO overhead for efficient resource utilization.

Update Statistics: Refreshes metadata used by the query optimizer, ensuring accurate execution plans and improving query efficiency.

Integrity Check: Executes comprehensive database integrity validations (e.g., DBCC CHECKDB) to detect corruption and maintain structural reliability and health.

Shrink Database: Reclaims unused storage space. Should be used sparingly, as it may lead to fragmentation and impact performance.

Restore Database: Facilitates recovery by restoring a database to a consistent state, ensuring minimal downtime and maintaining business continuity after failures.

Maintenance Template View

Reach the maintenance template view by pressing the settings button and choosing Maintenance G Backup.

Legend
1	Selection Between Maintenance Template View and Custom Server View Both views serve distinct yet complementary purposes for managing configurations. The Template View enables users to manage and apply predefined maintenance templates across multiple servers efficiently, ensuring consistency in configurations. Conversely, the Custom Server View provides detailed visibility and control over individual server configurations, allowing for granular adjustments tailored to specific requirements.
2	Choose between affiliated servers and the rules tabs.
3	3-Dot Menu – Additional Options for a Template The 3-dot menu provides advanced options for managing templates. These options include: Set as Default Designates the selected template as the default for new server configurations. This ensures the template’s rules are automatically applied to servers added without prior configuration if selected after a new server is added. Duplicate Template Creates a new template by copying the rules and settings from the selected template. This is useful for quickly creating variations of existing templates. Edit Template Opens a popup dialog with detailed options for modifying the template’s rules and settings to align with specific requirements. Delete Template Deletes the selected template along with all associated linked server rules. Note that this action does not affect unlinked or custom server rules, which remain intact.
4	Create template button – opens a create template dialog
5	Search template by name
6	Navigating Between Backup and Optimization Sections Easily toggle between the Backup and Optimization sections within a template. This feature is particularly useful when managing templates with numerous rules, allowing quick access to relevant sections without scrolling through extensive configurations.
7	Summary + inner rule textual search
8	Add new backup rule button
9	Rule Row Features A rule row in the maintenance configuration interface provides several interactive options and functionalities for efficient rule management: Inline Rule Details Displays critical information about the rule, ensuring easy identification and understanding of its purpose and scope. Toggle Switch Enables or disables the rule for all associated servers. Disabling a rule automatically removes the corresponding MSSQL job from the associated servers. Copy Script Allows the user to copy the rule’s underlying script or procedure call to the clipboard for reuse or manual execution. Duplicate Rule Creates an identical copy of the existing rule, allowing for quick replication and customization. Edit Rule Opens the rule’s summary dialog, enabling detailed editing and fine-tuning of rule settings to match specific requirements. Delete Rule Prevents deletion of the predefined core rules (the basic four). When deleting a rule, all unlinked server-level rules will automatically transition to independent custom rules.
10	Add new optimization rule. Choose from reindex, integrity or update statistics.

Maintenance Template View – Affiliated Servers

Legend
1	Navigation between Affiliated servers or Rules
2	The Add Server button opens the server selection dialog, allowing users to choose and add or remove servers to the template. The final list of selected servers is confirmed within the dialog before association.
3	This list displays all servers currently linked to the template, providing a clear overview of server associations for efficient management. Pressing the X on a server will detach it from the template.

Note:
If a server is detached from a rule, all linked rules associated with the server will be removed. However, any unlinked or custom rules will remain intact and unaffected by the detachment.

Common options for maintenance rules

DB Source selection

Maintanance Rule Settings – Source Selection

Within the settings for each maintenance rule, you will find the Source Selection option. This setting allows you to fine-tune the selection of databases or database groups to which the maintenance rule should be applied.

By default, the following database groups are predefined and can be applied at both the Maintenance Template and Custom Maintenance levels:

Source Database Type
All databases	The rule will apply to all databases
System databases	The rule will apply to all system databases (msdb, model, tempdb, master)
User databases	The rule will apply to all user databases (all databases except the system databases)
Availability group databases	The rule will apply to all databases that belong in an availability group.
User databases except availability group databases	The rule will apply to all user databases except those in availability groups.
Full databases	All databases that have are in full recovery mode
Simple databases	All databases that have are in simple recovery mode

Source Database Type in Template and Custom Maintenance Levels

1. Template Level:

The Include/Exclude section is used to include or exclude specific servers from a rule.
This functionality ensures that rules are applied only to the relevant servers, as multiple servers can be attached to a single template.

2. Custom Server Level:

The Include/Exclude section allows for granular control by including or excluding specific databases from a rule.
This flexibility is not available at the template level, as templates manage multiple servers sumultaneously.

3. Additional Custom Server Settings:

In the Custom Server Level, the Include/Exclude tab may offer additional configuration options depending on the rule type.
For example, in the Integrity rule, you can select specific objects to include in the rule.
Detailed settings for each maintenance rule are available within their respective configuration dialogs.

Maintanance Schedule

Scheduling options UI

Scheduling Options for Maintenance Rules

1. Familiar Interface:

The scheduling options are modeled after the MSSQL Job Scheduler to provide a familiar look and feel for seamless integration.

2. Flexible Scheduling:

You can attach one or more schedules to any maintenance rule, enabling precise control over execution. You can do that by pressing the
For Backup Rules, it is recommended to apply distinct schedules to each rule to optimize performance and resource utilization.

3. Position in Rule Execution:

The scheduling step is always positioned as the penultimate step in any maintenance rule workflow, ensuring all preceding configurations are processed before scheduling.

4. Edition-Specific Scheduling Types:

Different scheduling types are available for Express Edition Templates and Express Custom Rules, accommodating the unique requirements of these environments.

Schedule Management Features

Open Schedule Configuration Dialog
Enables the creation of a new schedule for the selected maintenance rule by opening the configuration dialog.
View Configured Schedule Details
1. Displays the specifics of the existing schedule.
2. Includes Edit and Delete options for modifying or removing the schedule as required.

One time schedule

One-Time Schedule Configuration

Schedule Name
Assign a descriptive name to the schedule for easy identification.
Select One-Time Schedule Type
Choose the one-time schedule option to execute the maintenance rule at a specific time.
Set Starting Time
Specify the exact time for the one-time schedule to initiate.

Daily schedule

Daily Schedule Configuration

Daily Time Schedule Type
Specifies the type of schedule to be executed on a daily basis, defining both the timing and frequency of maintenance rule execution.
Daily Frequency
Determines the interval at which the maintenance rule will run during the day.
Daily Frequency Options:
1. Single Occurrence:
  Executes the maintenance rule at a specific time during the day, such as 9:00 AM.
2. Recurring Occurrence:
  Executes the maintenance rule multiple times within the day, e.g., every 2 hours, starting at 10:00 AM and ending at 3:00 PM.
Activation and Deactivation Times:
Start At: Specifies when the schedule becomes active.
End At: Specifies when the schedule will automatically deactivate.

Weekly Schedule

Weekly Schedule Configuration

Weekly Schedule Configuration
Defines a schedule to execute maintenance rules on specific days of the week.
Weekly Recurrence Interval
Determines the number of weeks between each execution (e.g., every week or every two weeks).
Select Applicable Days
Choose the days of the week when the schedule will be active.
Daily Frequency Options:
1. Specific Time Execution:
  Executes the maintenance rule at a designated time, such as 9:00 AM.
2. Recurring Intervals:
  Executes the maintenance rule repeatedly within a day, e.g., every 2 hours from 10:00 AM to 3:00 PM.
Activation and Deactivation Times:
1. Start Time: Specifies when the schedule will activate.
2. End Time: Specifies when the schedule will automatically deactivate.

Monthly Schedule

Monthly Schedule Configuration

Monthly Schedule Type
Configures a schedule to execute maintenance rules on a monthly basis.
Monthly Occurrence Options:
1. X Specific Day of the Month:
  Executes the rule on a specific day of the month, such as the 3rd day of every second month.
2. X Complex Occurrence:
  Executes the rule on a specific day and week combination, such as the Second Friday of every third month.
Daily Frequency Options:
1. Single Time Execution:
  Executes the rule at a specific time during the day, e.g., 9:00 AM.
2. Recurring Intervals:
  Executes the rule multiple times within a day, e.g., every 2 hours starting at 10:00 AM and ending at 3:00 PM.
Activation and Deactivation Times:
1. Start At: Specifies when the scheduler will activate.
2. End At: Specifies when the scheduler will automatically deactivate.

Express Scheduling options UI

One time schedule

One-Time Schedule Configuration

Schedule Name
Assign a descriptive name to the schedule for easy identification.
Select One-Time Schedule Type
Choose the one-time schedule option to execute the maintenance rule at a specific time.
Set Starting Time
Specify the exact time for the one-time schedule to initiate.

Daily Schedule

Daily Schedule Configuration

Schedule Name
Assign a descriptive name to the schedule for easy identification.
Daily Time Schedule Type
Specifies the type of schedule to be executed on a daily basis, defining both the timing and frequency of maintenance rule execution.
Daily Frequency
Determines the interval at which the maintenance rule will run during the day.
Daily Frequency Occurrence::
Executes the maintenance rule multiple times within the day, e.g., every 10 minutes.
Activation and Deactivation Times:
1. Start At: Specifies when the schedule becomes active.
2. End At: Specifies when the schedule will automatically deactivate.

Weekly Schedule

Weekly Schedule Configuration

Schedule Name
Assign a descriptive name to the schedule for easy identification.
Weekly Time Schedule Type
Specifies the type of schedule to be executed on a weekly basis, defining both the timing and frequency of maintenance rule execution.
Weekly Frequency
Determines the interval at which the maintenance rule will run weekly.
Days of the week
Determines which days of the week the schedule will run.
Daily Frequency Occurrence:
Executes the maintenance rule multiple times within the day, e.g., every 10 minutes.
Activation and Deactivation Times:
1. Start At: Specifies when the schedule becomes active.
2. End At: Specifies when the schedule will automatically deactivate.

Monthly Schedule

Monthly Schedule Configuration

1. Schedule Name
  Assign a descriptive name to the schedule for easy identification.
2. Monthly Time Schedule Type
  Specifies the type of schedule to be executed on a weekly basis, defining both the timing and frequency of maintenance rule execution.
3. Monthly Frequency
  Determines in which months the schedule will be active.
4. Days of the month
  Determines which days of the month the schedule will run.
5. Days of the month occurrence
  Determines day G week frequency. For example, runs on the 1’st and 2’nd week of the month on Sunday.
6. Daily Frequency Occurrence
  Executes the maintenance rule multiple times within the day, e.g., every 10 minutes.
7. Activation and Deactivation Times:
  1. Start At: Specifies when the schedule becomes active.
  2. End At: Specifies when the schedule will automatically deactivate.

Database Backup

Backup maintenance rule

Purpose:

To ensure that critical data is protected and can be recovered in case of data loss, corruption, hardware failure, or disaster.

Why It’s Important:

Backups allow you to restore your database to a consistent state after an unexpected issue.
Regular backups (full, differential, or transaction log) help achieve Recovery Point Objectives (RPOs) and Recovery Time Objectives (RTOs).
Without a proper backup strategy, there is a significant risk of data loss that could be catastrophic to business operations.

Backup Summary Quick Access + Include Exclude

Backup Rule Configuration Summary

Backup Rule Summary
Displays an overview of the configured backup rule, including its key settings and status.
Include/Exclude Backup Rule Tab
Allows granular control over which databases or servers are included or excluded from the backup rule.
Multi-Tab Backup Plan Navigation
If the backup plan includes multiple backup types (e.g., DATA, LOG, DIFF), a multi-tab interface will be available to navigate between the individual backup rule configurations.
Configure Rule
Opens the Rule Wizard for additional configuration options and advanced settings.
Cancel/Rollback Changes
Reverts any unsaved changes made to the backup rule setting.
Save Updated Options
Commits the changes made to the backup rule configuration.

Include Exclude backup rule tab

Include Exclude Configuration Options

Close the Dialog
Closes the configuration dialog, discarding any unsaved changes.
Toggle the Rule
Enables or disables the rule. Disabling the rule will stop its execution and remove it from associated servers.
View Common Database Source Selection
See the common database source selection settings to fine-tune the databases or database groups included in the rule.
Include/Exclude Settings:
1. Template Level:
  The Include/Exclude section is used to include or exclude specific servers from the rule.
2. Custom Server Level:
  – The Include/Exclude section allows for granular control, enabling the inclusion or exclusion of specific databases.
  – his functionality is unavailable at the template level due to the association of multiple servers with a single template.

Backup rule wizard step 1 – Backup type

Backup Plan Configuration

Select Backup Types
1. Choose from three available backup types (Full Data Backup, Transaction Log Backup, and Differential Backup) to include in the backup plan.
2. During the subsequent steps of the wizard, a dedicated tab will be created for each selected backup type, allowing detailed configuration of individual backup rules.

Backup rule wizard step 2 – Main options

Backup Rule Configuration

Backup Destination:
1. Directory:
  The most common option. Specify a local or network directory for storing backup files.
2. URL:
  Available only in Custom Server Rules. Enables uploading backups directly to Azure Cloud Backup Storage.
Backup Root Directories:
1. Define custom directory to chose a specific directory on the destination server, this option only exists in the custom server rule.
2. If instance default is specified, the SQL Server default backup directory is used.
3. Select the Global Directory Setting to use the globally configured backup directory.
Backup File Retention Period:
1. Specify the time (in days or hours) after which backup files should be deleted.
2. If no time is specified, backup files will not be deleted.
3. Database backups include a verification check to ensure that transaction log backups newer than the latest full or differential backup are not deleted.
Old Backup File Deletion:
1. Choose when to delete old backup files:
  – Before Backup: Delete old backup files before performing the new backup.
  – After Backup: Delete old backup files after completing the backup and verification. If the backup or verification fails, no files are deleted (default behavior).

Specify Database Order:
Define the order in which databases will be backed up:

Value	Description
NULL	Default: Databases are backed up in the specified order, then sorted by database name in ascending order.
DATABASE_NAME_ASC	Sort by database name in ascending order.
DATABASE_NAME_DESC	Sort by database name in descending order.
DATABASE_SIZE_ASC	Sort by database size in ascending order.
DATABASE_SIZE_DESC	Sort by database size in descending

Navigation Controls:
Use navigation buttons to move between configuration steps or tabs for backup types.

Backup rule wizard step 3 – Additional options

Backup Rule Advanced Options

Toggle Options: Configure additional settings to customize the backup behavior:
1. Init: Specify whether the backup file should overwrite existing files.
2. Verify Backup: Verify the integrity of the backup after completion.
3. Checksum: Enable backup checksums for error detection.
4. Copy-Only: Perform a copy-only backup, which does not affect the sequence of conventional SQL Server backups.
5. Compress: Compress the backup file. If unspecified, the backup compression default in sys.configurations will be used.
6. Format: Specify whether a new media header should be created.
7. Directory Check: Verify if the backup root directory exists before executing the backup.
8. Read/Write Filegroups: Backup the primary filegroup and any read/ write filegroups.
Database Selection – READ_ONLY / READ_WRITE:
Choose which databases to include based on their access mode:
1. All Databases: Include both READ_ONLY and READ_WRITE databases (default).
2. Read-Only Databases: Include only databases marked as READ_ ONLY.
3. Read/Write Databases: Include only databases marked as READ_ WRITE.
4. The is_read_only column in sys.databases is used to determine the database type.
Number of Backup Files:
Specify the number of backup files to create. The default is equal to the number of backup directories specified, with a maximum limit of 64 files.
Backup Description:
Provide a description for the backup to document its purpose or any relevant details.
Navigation Controls:
Use navigation buttons to move between configuration steps or review backup settings.

Backup rule wizard step 4 – Directory & File Structure

File Format Name Options

Custom File Name Format:
Define a unique file name format for backup files. Options include:
1. Custom Format: Create a personalized naming convention.
2. Availability Group File Name Format: Use standardized formats for availability groups.
3. Directory-Based Format: Apply naming conventions based on the directory structure.
Directory Structure Control:
To manage the directory structure for backups, configure settings in Global Settings.
Format Customization:
1. Selecting any file name format option will open a popup dialog.
2. Use predefined constants or create custom constants to modify the file name format as required.

File Name Format Customization

Available Constant List
Provides a list of predefined constants that can be used to structure the file name format (e.g., <ServiceName>, <ClusterName>, <Year>).
Selected Constant List
Displays the constants currently chosen for the file name format. These constants will determine the structure of the backup file names.
Free-Form Edit of Final Structure
1. Allows manual editing of the file name format by combining selected constants with custom text or separators.
2. The final structure preview will update dynamically based on your inputs.

Backup rule wizard step 5 – Mirror

Use this option to upload the backup file to another site.

Backup Type Configuration

Backup Type Tab Navigation
Navigate between different backup types (e.g., Full, Log, Differential) using the tab interface for easy configuration of individual backup rules.
Mirror Site Backup Destination Options
Configure the backup destination for a mirror site. These options are similar to the primary options in Step 2. Backup Destinations, allowing selection of a custom local directory, instance default directory or the global setting’s master backup directory.
Mirror Site Directory Options
Specify the directory for the mirror site backup. These settings mirror the Step 2 Directory Options.
Cleanup Time
Define the retention period for mirror site backups. These options are consistent with the Step 2 Cleanup Time, allowing you to specify the time in days or hours after which old backup files are deleted.
Cleanup Mode
Determine whether old backup files are deleted before or after the backup operation. These settings are identical to those in Step 2 Cleanup Mode, ensuring consistency in cleanup behavior.

Backup rule wizard step 6 – Schedules

See Scheduling options in UI for details on how to use the schedule.

Differences between template backup rule to custom server backup rule

Maintenance Templates vs. Custom Server Configuration

A Maintenance Template is associated with one or more servers and is designed to provide a general framework for maintenance rules. It allows for efficient management of multiple servers by applying consistent configurations across them.

At the Custom Server Level, you can override template settings and define more specific configurations tailored to the unique requirements of an individual server. This level of customization ensures flexibility while maintaining the foundational structure provided by the template.

Custom server – Include Exclude backup rule tab

Backup Rule Custom Server Include / Exclude Configuration Options

Toggle Rule Enable/Disable
Activate or deactivate the selected rule for the associated server.
Select Availability Groups
Choose from a list of availability groups configured on the server to apply the rule to specific groups only.
Include/Exclude Databases
Customize the rule by including or excluding specific databases on the server based on the requirements.
Final Database List
A comprehensive list of databases is generated based on the availability group selection and inclusion/exclusion criteria, providing a clear overview of the databases affected by the rule.

In Step 2 of the backup wizard under custom server configuration, the ‘custom directory’ option will exist to allow the user to select a specific directory.

Database Backup Encryption Options (only in custom server)

Enable/Disable Encryption
Toggle the encryption feature for database backups.
Select Encryption Algorithm
Choose an encryption algorithm from the list of supported options (e.g., AES-128, AES-256, Triple DES).
Select Server Certificate
Choose a server certificate for encryption. The list of available certificates is retrieved from sys.certificates on the server.

Database Reindex

Reindex maintenance rule

Purpose:

To rebuild or reorganize indexes for improved query performance.

Why It’s Important:

Over time, indexes become fragmented due to frequent INSERT, UPDATE, and DELETE Fragmented indexes slow down queries.
Reindexing reduces fragmentation, improves data access speed, and optimizes storage.

It ensures SQL Server can efficiently use indexes to speed up query execution and maintain overall database performance.

Reindex Summary Quick Access + Include Exclude

Reindex Rule Configuration

Enable/Disable Reindex Rule
Toggle the reindex rule to activate or deactivate it for the associated servers or databases.
Include/Exclude Reindex Rule Tab
Configure specific databases or database groups to include or exclude from the integrity checks associated with the reindex rule.
Configure Rule
Opens the Reindex Rule Wizard for advanced configuration and fine- tuning of reindexing settings.
Reindex Rule Summary s Quick Access Links
Provides an overview of the reindex rule configuration, along with quick access links to key settings for faster navigation and updates.

Reindex Include Exclude rule tab

See common DB Source selection for template and Custom server – include exclude for custom server level.

Reindex rule wizard step 1 – Main Options

Index Maintenance Configuration

Fragmentation Level Thresholds
1. Lower Limit: Set the minimum fragmentation level (in percentage) to trigger index maintenance. Default is 5%, as recommended by Microsoft.
2. Upper Limit: Set the maximum fragmentation level (in percentage) to determine when a rebuild or reorganization is needed. Default is 30%, based on Microsoft recommendations in Books Online.
3. Fragmentation levels are determined using avg_fragmentation_in_ percent from sys.dm_db_index_physical_stats.
Index Rebuild Options
1. INDEX_REBUILD_ONLINE: Rebuild the index online to minimize If the MSSQL edition does not support this option, it will be demote to INDEX_REBUILD_OFFLINE.
2. INDEX_REBUILD_OFFLINE: Rebuild the index offline.
3. INDEX_REORGANIZE: Reorganize the index to reduce fragmentation.
Toggle Options
1. Partition Level: Maintain partitioned indexes at the partition When enabled, fragmentation levels and page counts are checked for each partition, and appropriate maintenance (reorganize or rebuild) is performed.
2. Sort in TempDB: Use tempdb for sort operations during index rebuilds.
3. Resumable: Enable resumable online index operaions.
4. Pad Index: Apply the fill factor’s specified percentage of free space to intermediate-level index pages.
5. LOB Compaction: Compact pages containing large object (LOB) columns when reorganizing indexes.
6. Update Statistics: Automatically update statistics after index rebuild operations.
Minimum Number of Pages
1. Define a minimum index size, in pages, for Indexes smaller than this threshold are skipped. Default: 1000 pages, as recommended by Microsoft.
2. page_count is checked in dm_db_index_physical_stats.
Maximum Number of Pages
1. Define a maximum index size, in pages, for Indexes larger than this threshold are skipped. Default: No Limit.
2. page_count is checked in dm_db_index_physical_stats.
Statistics Options
1. ALL: Update both index and column statistics.
2. INDEX: Update only index statistics.
3. COLUMNS: Update only column statistics.
4. NULL: Skip statistics maintenance (default).
Statistics Modification Level
1. Specify the percentage of modified rows required to trigger statistics updates.
2. Updates are also performed dynamically based on a decreasing threshold: SQRT(number of rows * 1000).
3. Data Sources:
  – SQL Server 2008 R2 SP2+ and SQL Server 2012 SP1+: modification_ counter and rows from sys.dm_db_stats_properties.
  – Earlier Versions: rowmodctr and rowcnt from sysindexes.
  – Incremental Statistics: modification_counter and rows from dm_ db_incremental_stats_properties.
Statistics Sampling
1. Specify the percentage of the table to sample when updating statistics:
  – 100%: Full scan (equivalent to FULLSCAN option).
  – Unspecified: SQL Server automatically calculates the required sample size.
2. The StatisticsSample option uses SAMPLE and FULLSCAN options in the SQL Server UPDATE STATISTICS command.

Reindex rule wizard step 2 – Advanced Options

Advanced Index Maintenance Options

1. Time Limit

Set a maximum time, in seconds, after which no additional commands are executed.
Default: No time limit.

2. Time Delay

Define a delay, in seconds, between index maintenance commands to reduce resource contention.
Default: No delay.

3. Max Degree of Parallelism (MaxDOP)

Specify the number of CPUs to use for index rebuild operations.
If unspecified, the global maximum degree of parallelism setting is used.
The MaxDOP option in IndexOptimize corresponds to the MAXDOP option in the SQL Server ALTER INDEX command.

4. Fill actor

Define how full index pages should be when rebuilding indexes, as a percentage.
Default: Uses the fill factor defined in indexes.
The FillFactor option in IndexOptimize aligns with the FILLFACTOR option in the SQL Server ALTER INDEX command.

5. Wait at Low Priority Max Duration (Minutes)

Specify the time, in minutes, that an online index rebuild operation will wait for low-priority locks.
The WaitAtLowPriorityMaxDuration option in IndexOptimize uses the WAIT_AT_LOW_PRIORITY and MAX_DURATION options in the SQL Server ALTER INDEX command.

6. Wait at Low Priority Abort After Wait

Define the action to take after an online index rebuild operation has been waiting for low-priority locks:

Value	Description
NONE	Continue waiting for locks with normal priority.
SELF	Abort the online index rebuild operation.
BLOCKERS	Terminate user transactions that block the online index rebuild operation.

The WaitAtLowPriorityAbortAfterWait option in IndexOptimize aligns with the WAIT_AT_LOW_PRIORITY and ABORT_AFTER_WAIT options in the SQL Server ALTER INDEX command.

7. Lock Timeout (Seconds)

Set the time, in seconds, that a command waits for a lock to be released.
Default: No time limit.
The LockTimeout option in IndexOptimize corresponds to the SET LOCK_TIMEOUT statement in SQL Server.

8. Lock Message Severity

Define the severity level for lock timeout and deadlock messages:

Value	Description
10	Informational message.
16	Error message (default).

9. Execution Order

Specify the order in which databases are processed:

Value	Description
NULL	Process databases in the specified order, then sort by database name in ascending order (default).
DATABASE_NAME_ASC	Sort by database name in ascending order.
DATABASE_NAME_DESC	Sort by database name in descending order.
DATABASE_SIZE_ASC	Sort by database size in ascending order.
DATABASE_SIZE_DESC	Sort by database size in descending order.

Reindex rule wizard step 3 – Schedule

See Scheduling options in UI for details on how to use the schedule.

Differences between template reindex rule to custom server reindex rule

Custom server – Include Exclude reindex rule tab

Index Rule Configuration Options

Close Dialog
Exit the configuration dialog, discarding any unsaved changes.
Enable/Disable Rule
Toggle the rule to activate or deactivate it for the associated server or databases.
Select Predefined Database Grouping
Choose from a predefined grouping of databases for streamlined configuration.
Select Availability Database Groups
Choose from a list of availability groups on the server to apply the rule.
Select Specific Indexes
Choose from a list of specific indexes to include in the rule configuration.
Include/Exclude Databases
Customize the rule by including or excluding specific databases on the customized server.
Final Database List
A comprehensive list of databases is generated based on the selected groupings, availability groups, and inclusion/exclusion criteria, providing clarity on the databases affected by the rule.

Update Statistics

Update Statistics maintenance rule

Purpose:

To refresh metadata used by the SQL Server query optimizer for making execution plan decisions.

Why It’s Important:

The SQL Server query optimizer relies on statistics (data distribution and cardinality information) to choose the best query execution plan.
As data changes over time, statistics become stale or inaccurate, leading to poor execution plans and performance degration.
Updating statistics ensures queries run efficiently and resource usage is minimized.

Update Statistics Summary Quick Access + Include Exclude

Update Statistics Rule Configuration

Update Statistics Rule Summary
Provides an overview of the current update statistics rule, including key settings and the status of its configuration.
Include/Exclude Update Statistics Rule Tab
Allows customization by including or excluding specific databases or database groups for the update statistics operation.
Configure Rule
Opens the Rule Wizard to access advanced configuration options. for fine- tuning the update statistics settings.

Update Statistics Exclude include rule tab

See common DB Source selection for template and Custom server – include exclude for custom server level.

Update statistics rule wizard step 1 – Main Options

Advanced Options for Index and Statistics Maintenance

1. Toggle Options

Configure additional parameters to customize index and statistics maintenance:

Partition Level: Maintain partitioned indexes at the partition If enabled, fragmentation levels and page counts are checked for each partition, and the appropriate maintenance (reorganize or rebuild) is applied.
Sort in TempDB: Use tempdb for sort operations during index rebuilds.
Resumable: Enable resumable online index operations to allow interruptions and resumptions.
Pad Index: Apply the fill factor’s free space percentage to intermediate-level index pages.
LOB Compaction: Compact pages containing large object (LOB) columns during index reorganization.
Is Only Modified Statistics: Update statistics only if rows have been modified since the last update.
– Data is retrieved from modification_counter in dm_db_ stats_properties (SQL Server 2008 R2 SP2+, SQL Server 2012 SP1+).
– Earlier versions use rowmodctr in sysindexes. For incremental statistics, modification_counter in sys.dm_db incremental_stats_properties is used.

2. Minimum Number of Pages

Set a minimum index size (in pages) for Indexes with fewer pages are skipped. Default: 1000 pages, based on Microsoft recommendations.
Page size is determined using page_count in sys.dm_db_index_ physical_stats.

3. Maximum Number of Pages

Set a maximum index size (in pages) for Indexes larger than this threshold are skipped. Default: No limit.
Page size is determined using page_count in dm_db_index_ physical_stats.

4. Statistics Options

ALL: Update both index and column statistics.
INDEX: Update only index statistics.
COLUMNS: Update only column statistics.
NULL: Skip statistics maintenance (default).

5. Statistics Modification Level

Specify the percentage of modified rows that trigger a statistics update.
Updates are also triggered dynamically when the number of modified rows meets a threshold: SQRT(number of rows * 1000).
Data Sources:
– SQL Server 2008 R2 SP2+ and SQL Server 2012 SP1+: Uses modification_counter and rows in dm_db_stats_properties.
– Earlier Versions: Uses rowmodctr and rowcnt in sysindexes.
– Incremental Statistics: Uses modification_counter and rows in dm_db_incremental_stats_properties.

6. Statistics Sample

Specify the percentage of the table to sample for statistics updates:
– 100%: Equivalent to a full scan.
– Unspecified: SQL Server automatically determines the sample size.
The StatisticsSample option in IndexOptimize uses SAMPLE and FULLSCAN in the SQL Server UPDATE STATISTICS command.

Update statistics rule wizard step 2 – Advanced Options

Index Maintenance Configuration: Time and Lock Settings

1. Time Limit

Specify the maximum time, in seconds, after which no additional commands are executed.
Default: No time limit.

2. Time Delay

Define a delay, in seconds, between index maintenance commands to reduce resource contention.
Default: No delay.

3. Max Degree of Parallelism (MaxDOP)

Specify the number of CPUs to use during index rebuild operations.
If unspecified, the global maximum degree of parallelism setting is used.
The MaxDOP option in IndexOptimize maps to the MAXDOP option in the SQL Server ALTER INDEX command.

4. Fill Factor

Define how full the index pages should be when rebuilding indexes, as a percentage.
Default: Uses the fill factor specified in sys.indexes.
The FillFactor option in IndexOptimize corresponds to the FILLFACTOR option in the SQL Server ALTER INDEX command.

5. Wait at Low Priority Max Duration (Minutes)

Specify the time, in minutes, an online index rebuild operation waits for low-priority locks.
The WaitAtLowPriorityMaxDuration option in IndexOptimize aligns with the WAIT_AT_LOW_PRIORITY and MAX_DURATION options in the SQL Server ALTER INDEX command.

6. Wait at Low Priority Abort After Wait

Define the action to take after an online index rebuild operation has been waiting for low-priority locks:

Value	Description
NONE	Continue waiting for locks with normal priority.
SELF	Abort the online index rebuild operation.
BLOCKERS	Terminate user transactions that block the online index rebuild operation.

The WaitAtLowPriorityAbortAfterWait option in IndexOptimize uses the WAIT_AT_LOW_PRIORITY and ABORT_AFTER_WAIT options in the SQL Server ALTER INDEX command.

7. Lock Timeout (Seconds)

Specify the maximum time, in seconds, that a command waits for a lock to be released.
Default: No time limit.
The LockTimeout option in IndexOptimize corresponds to the SET LOCK_TIMEOUT statement in SQL Server.

8. Lock Message Severity

Define the severity level for lock timeout and deadlock messages:

Value	Description
10	Informational message.
16	Error message (default).

9. Execution Order

Define the order in which databases are processed during index maintenance:

Value	Description
NULL	Process databases in the specified order, then sort by database name in ascending order (default).
DATABASE_NAME_ASC	Sort databases by name in ascending order.
DATABASE_NAME_DESC	Sort databases by name in descending order.
DATABASE_SIZE_ASC	Sort databases by size in ascending order.
DATABASE_SIZE_DESC	Sort databases by size in descending order.

Update statistics rule wizard step 3 – Schedule

See Scheduling options in UI for details on how to use the schedule.

Differences between template update statistics rule to custom server update statistics rule

Custom server – Include Exclude update statistics rule tab

Rule Configuration Options

Close Dialog and Toggle Rule
Exit the configuration dialog while toggling the rule to enable or disable it as needed.
Navigate Between Rule Summary and Database Selection
Easily switch between the rule summary view and database selection settings for streamlined configuration.
Choose Predefined Database Group Type
Select a predefined database grouping to quickly apply rules to categorized sets of databases.
Select Specific Indexes
Choose from a list of specific indexes to apply the rule to, enabling granular control over index maintenance.
Include or Exclude Databases
Customize the rule by including or excluding specific databases on the customized server.
Final List of Databases
View the final list of databases based on the chosen database group type, selected indexes, and inclusion/exclusion criteria.

Integrity Maintenance

Integrity maintenance rule

Purpose:

To detect and report any corruption or inconsistencies in the database.

Why It’s Important:

Running DBCC CHECKDB helps identify issues with data pages, tables, indexes, and the overall physical structure of the database.
Detecting corruption early prevents data loss or operational downtime.
Regular integrity checks are critical for ensuring database reliability and identifying problems before they become severe.

Integrity Summary Quick Access + Include Exclude

Integrity Rule Configuration

Integrity Rule Summary
Displays an overview of the integrity rule, including its current configuration and status.
Include/Exclude Integrity Rule Tab
Customize the integrity rule by specifying databases or database groups to include or exclude from the checks.
Navigation Between Rule Summary and Include/Exclude Tab
Seamlessly toggle between the rule details summary and the Include/ Exclude tab for efficient configuration.
Configure Rule
Opens the Rule Wizard, providing advanced options for fine-tuning the integrity rule settings.

Include Exclude integrity rule tab

See common DB Source selection for template and Custom server – include exclude for custom server level.

Integrity rule wizard step 1 – Main Options

Integrity Rule Advanced Configuration

1. Toggle Options

Configure the following settings to fine-tune the integrity rule:

a. Physical Only:

Limit checks to the physical structures of the database.
Uses the PHYSICAL_ONLY option in SQL Server DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE commands.

b. Data Purity:

Validate column values to ensure they are not invalid or out- of-range.
Uses the DATA_PURITY option in DBCC CHECKDB and DBCC CHECKTABLE.

c. No Indexes:

Exclude nonclustered indexes from integrity checks.
Uses the NOINDEX option in DBCC CHECKDB, DBCC CHECKFILEGROUP, DBCC CHECKTABLE, and DBCC CHECKALLOC.

d. Extended Logical Checks:

Perform extended logical checks for more comprehensive validation.
Uses the EXTENDED_LOGICAL_CHECKS option in DBCC CHECKDB.
Note: Cannot be combined with the Physical Only option.

e. Table Lock:

Use table locks instead of internal database snapshots during checks.
Uses the TABLOCK option in DBCC CHECKDB, DBCC CHECKFILEGROUP, DBCC CHECKTABLE, and DBCC CHECKALLOC.

f. Max Degree of Parallelism (MaxDOP):

Specify the number of CPUs to use during database checks.
Uses the MAXDOP option in DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE.

2. Time Limit

Set the maximum time, in seconds, after which no further commands are executed.
Default: No time limit.

3. Time Delay

Define a delay, in seconds, between index commands to reduce resource contention.
Default: No delay.

4. Max Degree of Parallelism (MaxDOP)

Specify the number of CPUs to use for index rebuild operations.
If unspecified, the global maximum degree of parallelism is used.
Uses the MAXDOP option in the SQL Server ALTER INDEX command.

5. Lock Timeout (Seconds)

Set the maximum time, in seconds, that a command waits for a lock to be released.
Default: No time limit.
Uses the SET LOCK_TIMEOUT statement in SQL Server.

6. Lock Message Severity

Define the severity level for lock timeout and deadlock messages:

Value	Description
10	Informational message.
16	Error message (default).

7. Execution Order

Specify the order in which databases are processed during integrity checks:

Value	Description
NULL	Process databases in the specified order, then sort by database name in ascending order (default).
DATABASE_NAME_ASC	Sort databases by name in ascending order.
DATABASE_NAME_DESC	Sort databases by name in descending order.
DATABASE_SIZE_ASC	Sort databases by size in ascending order.
DATABASE_SIZE_DESC	Sort databases by size in descending order.

Integrity rule wizard step 2 – Schedule

See Scheduling options in UI for details on how to use the schedule.

Differences between template integrity rule to custom server integrity rule

Custom server – Include Exclude integrity rule tab

Rule Configuration Options

Close Dialog and Toggle Rule
Exit the configuration dialog while enabling or disabling the rule for the associated server or databases.
Navigate to Rule Details
Access the detailed configuration settings of the selected rule for review or modifications.
Choose Predefined Database Group Type
Select a predefined grouping of databases for streamlined rule application.
Select Availability Groups
Choose specific availability groups from the server to include in the rule.
Select Replica Types
Specify the types of replicas within the availability groups to participate in the rule:
- PRIMARY: Include primary replicas.
- SECONDARY: Include secondary replicas.
Select Specific Objects
Choose specific objects (e.g., tables, indexes) to include in the rule for granular control.
Include or Exclude Databases
Customize the rule by including or excluding specific databases on the customized server.
Final List of Databases
View the final list of databases derived from the chosen group types, availability groups, replica types, and inclusion/exclusion criteria.

Shrink Maintanance

Shrink maintenance rule

Purpose:

To reclaim unused space in the database

Why It’s Important:

Shrinking reduces database file size when there is significant free space (e.g., after deleting large amounts of data).
However, caution is required because frequent shrinking can cause fragmentation and performance issues.
It is best used after significant cleanup or archiving operations to optimize storage usage.

Shrink rule wizard step 1 – Optimization Options

Shrink Operation Configuration

1. Shrink Data Files

Specify whether the operation should shrink data files.
Define the desired amount of free space to leave in the data files after the shrink operation is complete.

2. Shrink Log Files

Specify whether the operation should shrink log files.
Define the desired amount of free space to leave in the log files after the shrink operation is complete.

3. Backup Log Files

Indicate whether the log file should be backed up before the shrink operation begins, ensuring data recovery capabilities.

4. Log File Truncate Only

Enable support for the truncate-only log file option, primarily for legacy system compatibility.

Restore from Backup

Restore maintenance rule

Purpose:

To recover a database from a backup.

Why It’s Important:

Database restore is essential for recovering from data loss, corruption, or system failures.
It ensures business continuity by restoring the database to its previous state.
Regularly testing restores validates your backup strategy and ensures backups are functional when needed.

Restore rule wizard step 1 – Destination

Restore Source Selection

Restore from Backup History
1. Retrieve a mapping of backup history for the selected server and database.
2. Change the source database by using the Choose button (described in 4).
Search by File Names
Search from a list of available backup file names for quick selection.
Apply Filters
Use predefined filters to narrow down the list and pinpoint the backup file you are searching for.
Choose Button
a. Switch to another server and database to search their backup history.
b. Only retrieves backup files that still exist in their original locations.
Existing Backup Files List
Displays the list of located backup files. Files are verified to exist in their original locations as recorded in msdb.

Restore rule wizard step 2 – Restore from backup files source

Restore from Manually Selected Backup Files

1. Manual Backup File Selection

Opt to restore the database using manually selected backup files instead of relying on backup history.

2. Add Files Button

Click to open a remote directory browsing control that allows you to navigate and select files from the directories on the destination server.
Note: The backup files must reside on the destination Multiple files may be used. Experda will sort out the correct logical order for the files to be restored.

3. List of Added Files

Displays the files you have selected for the restore operation.
Important: If the selected backup files contain multiple databases, you will be prompted to choose which database to restore.

Restore rule wizard step 2 – Restore to point in time source

Point-in-Time Recovery Configuration

1. Point-in-Time Recovery Option

Select this option to perform a point-in-time recovery using the backup history stored in the msdb database of the selected source server and database.

2. Change Source Server/Database

Modify the source server or database. The backup history from the newly selected source server’s msdb database will be scanned and used for recovery.

3. Select Date and Time

Choose the desired recovery date and time from the timeline interface.
Use timeline interval options (e.g., hourly, daily) to refine the view and easily locate the preferred recovery point.

4. Timeline Control

Visualize the available backups on the timeline.
See the corresponding recovery points and determine the point-in- time restore availability based on the selected backups.

Restore rule wizard step 3 – Additional options

Post-Restore Options and Advanced Settings

1. Keep Database Open After Restore

Specify whether the database should remain open and accessible immediately after the restore operation completes.

2. Fix All Database Users

Automatically resolve mismatched database users by mapping them to the correct server logins.

3. Database Data File Location

Define the target location for the restored database’s data files.

4. Database Log File Location

Specify the destination for the restored database’s log files.

5. Buffer Count

Configure the number of I/O buffers to use during the restore process for optimal performance.

6. Max Transfer Size

Set the maximum size (in bytes) for each I/O operation during the restore process.

7. Block Size

Define the block size used for the restore operation.

8. Full Database Backup Before Restoration

Perform a full backup of the existing database before initiating the restore operation.
Note: This option is available only when the restore operation overwrites an existing database.

9. Full Database Backup Options

Configure the full backup options before the restore, similar to the backup command settings:

Choose a custom location for the backup file.
Enable backup file compression.
Set an expiration period for the backup file to be automatically purged.

Restore rule wizard – summary

Maintanance Templates

How to create a new maintenance template

Template Configuration

1. Select a Template Name

Choose a unique and meaningful name for the template to ensure easy identification and organization.

2. Choose Template Type

Regular Template:
- Applicable for all MSSQL editions except Express Edition.
- Rules in regular templates are implemented as MSSQL jobs.
Express Template:
- Designed specifically for MSSQL Express Edition.
- Rules in express templates are implemented as Windows Task Scheduler jobs.
Key Difference: Regular templates use MSSQL jobs, while express templates rely on Windows task scheduling.

3. Mark as Favorite

Highlight frequently used templates by marking them as favorites.
A star icon will appear next to the template name for quick Example: ★

Editing a maintenace template rule

Rule Editing Levels

1. Summary and Include/Exclude Tab

The first level of editing involves accessing the Summary and Include/Exclude tabs for each rule.
Use this level to review rule settings and manage database inclusion or exclusion.

2. Detailed Rule Configuration

The second level allows for in-depth configuration of the rule.
From the open tab, press the Configure Rule button to navigate to the corresponding Rule Wizard, where advanced options can be customized.

Attaching or removing servers from your maintenance template

Maintenance Template Management

1. Navigate Between Affiliated Servers and Rules
Seamlessly switch between the list of affiliated servers and the configured rules in the maintenance template \ custom view for streamlined management.

2. Add Server Button
a. Opens the Select Servers Dialog (displayed on the right), allowing you to add or remove servers from the template.
b. Important:

If a server is removed, all its linked rules will also be removed.
Unlinked or custom rules will remain and be converted to custom rules.

3. View and Manage Attached Servers
a. View a comprehensive list of servers attached to the template.
b. Remove a single server by clicking the X icon next to its name.

Including or excluding specific rules from your maintenance template

Include/Exclude Rules in Maintenance

• Template Level:

Exclude or include specific servers from a rule. For instance, if a template is attached to 20 servers, you can exclude a specific server from having that rule applied.

• Custom Server Level:

Similar functionality applies to databases. Include or exclude specific databases from a rule on the custom server level.

• Steps to Include/Exclude a Rule:

Locate the desired rule and click the Edit Rule button.
Navigate to the Include/Exclude tab.
Select the Include or Exclude operation.
On the right side of the screen, choose from the list of servers or databases to apply the operation.

Understanding the difference between “linked” \ “Unlinked” \ “Customized”

Server Rule Statuses

A server rule may have one of the following four statuses:

Linked
a. The server is attached to a template, and the rule is inherited exactly as configured in the template.
b. Behavior:
- If the server is detached from the template, all linked rules will be removed from the server.
Unlinked
a. The server is attached to a template, but the rule has been modified or customized for the specific server (e.g., changes to the rule schedule or parameters).
b. Behavior:
- An unlinked rule can be reset to match the template configuration.
- If the server is detached from the template, an unlinked rule becomes a custom rule.
Custom Rule
a. A rule created specifically for the server.
b. Behavior:
- Custom rules are independent and can exist whether or not the server is attached to a template.
Excluded
The rule exists in the attached template but has been specifically excluded for this server.

How to customize maintenance rules for your server

At Experda, we recommend utilizing server templates as they streamline the management of multiple instances by providing a centralized configuration. However, we understand that specific customization is sometimes unavoidable.

Customizing a Specific Server:

Switch to the Custom Setup view within the maintenance template view.
The Custom Setup view is nearly identical to the template view but allows for more granular control.
- Add new custom backup or optimization rules specific to the server.
- Edit an existing linked template rule to unlink it and customize it according to your server’s unique requirements.

This approach ensures flexibility while maintaining the efficiency and structure provided by templates.

Global settings in Experda

Global settings are overarching configurations applied at the client region level. These settings define critical aspects such as agent configurations, backup directory structures, and centralized backup directories for all servers.

• Template and Custom Definitions:

Global settings are typically based on templates but can also be customized for specific servers when necessary.
Every server must be connected to a template since default global settings must be defined during initial

• Impact of Global Settings:

These advanced settings affect critical modules such as charts, aggregations, alerts, and more within the Experda
Any changes made to global settings take a few minutes to propagate to all connected servers.

• Global Settings Categories:

1. Agent Properties:

Define the overall agent
Example: Specify the number of parallel Extended Events

allowed per server for monitoring suspicious activities.

2. Backup Settings:

High-level backup configurations, such as defining the backup folder structure.
Detailed backup rules can be configured under the

Maintenance s Backup section.

3. Purging Policy Settings (future)

Define how long data is retained within the
Each purging setting applies to a specific table, ensuring effective data management.

Navigating To The Global Settings

The Global settings Template View

Global Settings Interface Overview

1. Toggle Between Views

Switch between the Template View and the Custom Server View to manage global settings.

2. 3-Dot Menu for Template Management

Access the 3-dot menu to manage the selected template:

Set as Default: Designates the current template as the default global
- Only one default template is allowed per
- New servers are automatically attached to the default template.
Duplicate Template: Creates a copy of the current template with a new name, retaining all its settings.
Edit Template: Opens the edit template dialog for
Delete Template: Removes the current
If the deleted template is the default, another template will automatically be selected as the new default.

3. Create New Template

Click the Create New Template button to open the dialog for defining a new template.

4. Toggle Between Tabs

Switch between the Template Rules and Affiliated Servers tabs for detailed configuration.

5. Summary and Search

Use the summary area to review high-level details and search for specific rules or settings.

6. Template List

View all available templates, including a star indicator for favorite templates.

7. Agent Settings Grid

Edit agent-related configurations directly within the grid.

8. Backup Settings Grid

Modify backup-related settings for global consistency.

9. Data Purging Policy Grid

Configure data retention policies for various tables to manage storage effectively.

Global Settings Custom Server View

For most functionality, please see the above section “Global settings template view”. In this section we will highlight the differences between the views.

Custom Server Configuration

Assign a Different Template
Directly assign a different template to the server from this interface.
List of Servers
View and manage all servers associated with the selected template or custom settings.
Unlinked Global Settings
1. At the custom server level, each global setting can be placed in an “unlinked” state, indicating it has been customized and differs from the attached template.
2. Revert to Template:
  Click the X button to reset the unlinked global setting back to the template’s default configuration.

Appendix A – Alerts

Always On Database Not Healthy	Always On	AO Availability group database not healthy
Always On Database not ready for automatic failover	Always On	Always On Database not ready for automatic failover
Always On Error/Failover events	Always On	Always on error event occurred (possible failover)
Always On Replication Time Lag Secondary Commit Behind	Always on	Always On Replication Time Lag Secondary Commit Lag Over @PARAM1 sec
Always On Replication Time Lag Log sent	Backup G Restore	Always On Replication Time Lag Log sent from Primary to Secondary
Database Bacfiup Operation Occured	Backup G Restore	Backup operation occurred
Database Bacfiup Too Old	Backup G Restore	No @PARAM1 backup for over @PARAM2 @ PARAM3
Database No Bacfiup Exists	Backup G Restore	Database has no @PARAM1 backups
Database Old Restore	Backup G Restore	Last restore older than @PARAM1 day
Database Restore Operation Occured	Backup G Restore	Restore operation occurred on database
Database Blocfiing	Database Diagnostics	Process is BLOCKING for over @PARAM1 sec onds
Database Fragmented Index	Database Diagnostics	Alert if found fragmented index above @ PARAM1% (min index size: @PARAM2 pages)
Database Has Connections	Database Diagnostics	Database has active connections
Database Integrity Checfi Over due	Database Diagnostics	Last database integrity check check was
Database Object Locfi	Database Diagnostics	@PARAM1 object is LOCKED for over @PARAM2 seconds
Database Object Type Locfi	Database Diagnostics	@PARAM1 object LOCK occurs for over @ PARAM2 seconds
Database Page Verification Set To None	Database Diagnostics	Alert If Database Page Verification Set To None
Database Process Locfi	Database Diagnostics	Process is LOCKING for over @PARAM1 seconds
Server Deadlocfi Monitor	Database Diagnostics	Deadlock event occurred
Database Data File Size Usage	Database Files	Data file size usage is above @PARAM1%
Database Data+Log Same Drive	Database Files	Database DATA files and LOG files are on the same logical drive
Database Log File Size Usage	Database Files	Log file size usage is above @PARAM1%
Database Log Size	Database Files	Database LOG size is over @PARAM1% of the to tal database size (min log size: @PARAM2 MB)
Database Total Data Size	Database Files	Database DATA size is over @PARAM1 MB
Database Total Log Size	Database Files	Database LOG size is over @PARAM1 MB
Database Total Size	Database Files	Database TOTAL size is over @PARAM1MB
Server Database Created	Database Monitoring	Database was created/attached
Server Database Deleted	Database Monitoring	Database was deleted/detached
Database Recovery	Database Properties	Database RECOVERY is not @PARAM1
Database Status	Database Properties	Database STATUS is not @PARAM1
Database Updateability	Database Properties	Database UPDATEABILITY is not @PARAM1
Database User Access	Database Properties	Database USER ACCESS is not @PARAM1
Dynamic_Wmi	Dynamic	Dynamic Alert @PARAM1 @PARAM2 for @ PARAM3
AdHoc Distributed Queries	Health Check	Ensure ‘Ad Hoc Distributed Queries’ Server Con figuration Option is set to ‘0’
Cross DB Ownership Chaining	Health Check	Ensure ‘Cross db ownership chaining’ Server Configuration Option is set to ‘0’
Everyone Has Remote Access	Health Check	The list of groups/users that have access to this computer from the network contains the group ‘Everyone’
OS Administrator Default Re moval	Health Check	The default local user ‘Administrator’ is still de fined on the machine
OS Audit Logon Attempts	Health Check	Test user logon attempts
OS Guest Disabled	Health Check	Local user Guest is enabled
OS Is NTFS	Health Check	Any Drive nonNTFS format
OS No Dev Tools	Health Check	Development tools exist in the environment
OS Null Session	Health Check	The OS is allowing Null session access
OS Only TCPIP	Health Check	Active Network Card
OS Remove Default Shares	Health Check	Default shares are defined on the server machine
OS Shutdown Privs	Health Check	The shutdown privilege is given to all Users, which is a bad practice for important systems
OS SQL Services Recommended Status	Health Check	SQL Services Recommended Statuses Verified
OS Too Many Shared	Health Check	Too many shares are defined on the machine
OS Unnecessary Users List	Health Check	Too many local users were found on the machine
Remote Accesses	Health Check	Ensure ‘Remote Accesses’ Server Configuration Option is
Remote Admin Connections	Health Check	Ensure ‘Remote Admin Connections’ Server Con figuration Option is set to ‘0’
SQL C2 Auditing	Health Check	C2 Audit Tracing is disabled for service
SQL Clr On	Health Check	clr operations are enabled on the server
SQL DeadlLocfi Flag Disabled	Health Check	SQL DeadlLock Flag Disabled
SQL No Sample Databases	Health Check	Sample DBs exist in an environment
SQL Not Default Port	Health Check	Verify not default port used
SQL Only TCPIP	Health Check	SQL Is Only In TCP\IP protocol
SQL Password Columns	Health Check	Suspicious column holding sensitive information. It should be considered for encryption.
SQL Single Instance	Health Check	Not more than one active SQL Server instance per machine
SQL Too Many Have SA Privi leges	Health Check	Too many defined logins with SA privileges
SQL Windows Authentication	Health Check	SQL Server Authentication is enabled
SQL xp_cmdshell Enabled	Health Check	xp_cmdshell is enabled
Machine Any Drive Free Space	Logical Drives	Drive free space is under @PARAM1 @PARAM2
MachineSpecific Drive Free Space	Logical Drives	Drive @PARAM1 free space is under @PARAM2 @PARAM3
Machine Disfi Bytes Per Sec	Logical Drives Per formance	logical partition ‘@PARAM1’ disk bytes per sec is above the value @PARAM2
Machine Disfi Reads Bytes Per Sec	Logical Drives Per formance	logical partition ‘@PARAM1’ disk read bytes per sec is above the value @PARAM2
Machine Disfi Writes Bytes Per Sec	Logical Drives Per formance	logical partition ‘@PARAM1’ disk write bytes per sec is above the value @PARAM2
Free Physical Memory Below Threshold	Memory	Free memory is under @PARAM1 @PARAM2
Machine SQL Process Memory Over Threshold	Memory	SQL Server memory usage is over @PARAM1 MB
Agent Monitor SQL Connection Failed	Monitor	Agent Monitor SQL Connection Failed
Agent Monitor Stopped	Monitor	Agent monitoring has stopped for more than @ PARAM1 minutes
Agent Monitor WMI Connection Failed	Monitor	Agent Monitor WMI Connection Failed
Agent Monitoring Error	Monitor	Agent Monitoring Error
Managed Server NonRespon sive	Monitor	Managed server negative ping response
Machine Process CPU Over Threshold	Process Per formance	Process @PARAM1 CPU % is above @PARAM2
Machine Process Memory Over Threshold	Process Per formance	Process @PARAM1 Memory Consumption is above @PARAM2
Machine Process Paging Over Threshold	Process Per formance	Alert if paging for process @PARAM1 is above @ PARAM2 MB
Machine SQL Process CPU Over Threshold	Process Per formance	SQL Server CPU usage is over @PARAM1 %
Machine AVG CPU Over Thresh old	Processor	average CPU is above @PARAM1% (AVG for last @PARAM2 minutes)
Machine Total CPU Over Threshold	Processor	CPU usage is above @PARAM1 %
Machine Total CPU Under Uti lized Threshold	Processor	CPU usage is below @PARAM1%
Server Any Job Status	Scheduled Operations	Any job execution has @PARAM1
Server Any Job Status (With Excluded)	Scheduled Operations	Alert if ANY Job execution (excluded jobs: @ PARAM1) has changed status
Server Job Duration Unusual	Scheduled Operations	Alert if ANY Job duration is @PARAM1% longer than usual (min job duration: @PARAM2 seconds)
ServerSpecific Job Status	Scheduled Operations	One or more of SQL Jobs @PARAM1 execution has @PARAM2
Server Blocfiing Occured	Server Diag nostics	Process is BLOCKING for over @PARAM1 sec onds
Server Mode Locfi Monitor	Server Diag nostics	@PARAM1 object LOCK occurs for over @ PARAM2 seconds
Server Process Locfi Monitor	Server Diag nostics	Process is LOCKING for over @PARAM1 seconds
Server Type Locfi Monitor	Server Diag nostics	@PARAM1 object is LOCKED for over @PARAM2 seconds
SQL Agent Is Down	Server Diag nostics	SQL server agent is down
SQL Server Error Log Entry	Server Diag nostics	SQL Server Error Log Entry
SQL Service Is Down	Server Diag nostics	SQL server service is down

Appendix B – List of Health Tests

*Name*	*Category*	*Description*
Remove/Disable Administra tor User	OS	The default Administrator is one of the biggest security risks that any server has. All hackers know that the system administrator will usually not remove this user. Therefore, they do not need to first hack the user list. They simply aim directly at this user to start with. This user is the 1 target for security breaches. It often uses the same password as many other servers for the simplicity of managing the domain. We highly recommend completely removing this user and using dedicated users for specific tasks.
Disable Disallowed Net worfis	OS
Audit Failed Connection Attempts	OS	We highly recommend auditing any failed login attempts to your operating system, as that will allow monitoring of repeated hack attempts.
Audit Login Attempts	OS	We highly recommend auditing any login at tempts to your operating system. Monitoring all logins will allow a better history search for poten tial security breaches.
Too Many Administrator Users	OS	The number of users with administrator privileges or the number of users that belong to the admin istrator’s group is very large. Having too many administrator users shows a bad design strategy. The more users that are defined as administra tors, the less control you have over your server. A hacker will have more users to choose from, and not all users choose powerful passwords, even when a password policy exists.
Disable Guest User	OS	The user ‘Guest’ is one of the first users that any skilled hacker will attempt to utilize to infiltrate your server and then your network. The user Guest should always be disabled.
Disfi Is Not Formatted In NTFS	OS	A disk has been found to have a file system other than NTFS. When using the Windows operating system, using the NTFS file system has been proven to be the most secure file system. You can easily format your disk by rightclicking on the disk name in Windows Explorer, selecting “format,” and in the dialog, select “NTFS” as the file system format. Remember that all informa tion will be deleted from the selected disk, so make sure you backup any materials you might have before formatting your disk.
Development Tools Installed	OS	Microsoft Development tools have been found on the server. Having Office, books online, sample code or any other development software installed on a production environment is not rec ommended.
Restrict Anonymous Login	OS	The Restrict Anonymous SSA check determines whether the RestrictAnonymous registry setting is used to restrict anonymous connections on the scanned computer. The registry setting is at the following location: HKEY_LOCAL_MACHINE\ SYSTEM\CurrentControlSet\Control\LSA\Re strictAnonymous Anonymous users can list cer tain types of system information, including user names and details, account policies, and share names. The list of user names and share names could help potential attackers learn compromis ing information, such as who is an administrator, which computers have weak account protection, and which computers share information with the network. Users who want enhanced security can restrict this function so that anonymous users cannot access this information. The RestrictA nonymous registry setting controls the level of enumeration that is granted to an anonymous user.
Server More Than One Job	OS	For a server to be completely secured, it cannot function as both an SQL Server server and have other functions, such as being an application server, email server or domain controller. Having a single SQL Server machine serve more than one purpose opens the server for potential secu rity threats by running unknown processes, hav ing additional open ports and shares and adds many unknowns to the overall security that is required to protect your data from being hacked from the outside.
Other Networfis Options Enabled	OS	For every defined active network in your system, there are many activated items you may choose from. By default, Windows will enable multiple access methods and protocols upon configura tion of a new network. As each protocol enables another method of access, the more protocols you enable, the more breaches you will have in your server’s security. We recommend preserv ing the bare minimum, which is the TCP\IP proto col alone. To change the protocols, go to Control Panel and select “Network and Sharing Center”. Then select “Change adapter settings”. Choose the active network and rightclick “Properties.” There, under the “Networking” tab, you will see the list of enabled items for the selected network.
Disable Additional Remote Access Disable Additional Remote Access	OS	This policy setting determines which users or groups can access the login screen of a remote device through a Remote Desktop Services con nection. It is possible for a user to establish a Re mote Desktop Services connection to a particular server but not be able to log on to the console of that same server. To control who can open a Remote Desktop Services connection and log on to the device, add users to or remove users from the Remote Desktop Users group. By default, members of the Administrators group have this right on domain controllers, workstations, and servers. The Remote Desktops Users group also has this right on workstations and servers. The following table lists the actual and effective de fault policy values. Default values are also listed on the policy’s property page.
Defined Default Shares	OS	The Windows operating system is installed with multiple default shares that have been proven to be easily breached. The most known default shares are created for each disk right on the root. When choosing whether to remove a default share, please consult with your system admin istrator to make sure no programs are actively using it.
NTFS Directory MSSQL Ser vice Permissions	OS	Test the NTFS directory permissions for the DATA and LOG directories defined on the MSSQL ser vice. We recommend that only the user that is defined to run the particular MSSQL Service will have full permissions on those directories. Any thing else is considered to be a security breach.
Service Status Disable Rec ommendations	OS	Some SQL Server services that don’t need to run shouldn’t be used and should be disabled. For example, if you are not using the job scheduler we recommend disabling the SQL Server Agent service.
Too Many Shares	OS	Each share on a remote server opens the server to potential hacks.
Too Many Local Users	OS	Having a very large user list on a production server may increase server vulnerability as not all users secure their passwords even when a prop er password policy exists.
Operating System Is Not Updated	OS	The operating system is not updated with the latest updates
MSSQL Login Auditing	SQL	Auditing all server logins is a security best prac tice and allows the analysis of server hacks.
MSSQL C2 Audit Recom mendations	SQL	C2 audit mode can be configured through SQL Server Management Studio or with the c2 audit mode option insp_configure. Selecting this option will configure the server to record both failed and successful attempts to access statements and objects. This information can help you profile system activity and track possible security policy violations. C2 audit mode saves a large amount of event information to the log file, which can grow quickly. If the data directory in which logs are being saved runs out of space, SQL Server will shut itself down. If auditing is set to start au tomatically, you must either restart the instance with the fflag (which bypasses auditing) or free up additional disk space for the audit log. C2 audit mode data is saved in a file in the default data directory of the instance. If the audit log file reaches its size limit of 200 megabytes (MB), SQL Server will create a new file, close the old file, and write all new audit records to the new file. This process will continue until the audit data directory fills up or auditing is turned off. To determine the status of a C2 trace, query the sys.traces catalog view. <p class=””>Note <p class=””>This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this fea ture. The C2 security standard has been super seded by Common Criteria Certification.
MSSQL Disable CLR	SQL	The common language runtime (CLR) integration feature is off by default. It must be enabled in order to use objects that are implemented using CLR integration. This is to protect the database system from running CLR code.
Deadlocfi flag unused	SQL	The deadlock flag has been found to be disabled. We recommend tracking deadlocks using the deadlock flag.
MSSQL Encrypted Bacfiups (From 2014)	SQL	Having a backup strategy is a best practice for the survival of the information. However, having single backup files located on the local disk or in the network without the actual information being encrypted is no different than having your com puter without password protection. Your data is the most important thing your organization has, and it is important to protect it in all forms, including in the backup file form.
MSSQL Enforce Password Policy	SQL	SQL Server can use Windows password policy mechanisms. The password policy applies to a login that uses SQL Server authentication and to a contained database user with a password. SQL Server can apply the same complexity and expiration policies used in Windows to pass words used inside SQL Server. This functionality depends on the NetValidatePasswordPolicy API. SQL Database enforces password complexity. The password expiration and policy enforcement sections do not apply to SQL Database.
MSSQL Sample Databases	SQL	In a production environment, we recommend having only active databases.
MSSQL Default Port	SQL	When attempting to retrieve information from your SQL Server databases, any hackers will usu ally attempt to enter the server via the defined default SQL Server ports. A security best prac tice will be to change the default ports.
MSSQL TCP\IP Only Protocol		All network protocols are installed by SQL Serv er Setup but may or may not be enabled. Use Shared memory and Named Pipes only if you have to. We recommend minimizing your usage only to TCP/IP to protect yourself from potential hacks.
MSSQL Password Columns	SQL	Keep the information marked as “password” en crypted.
MSSQL Default SA	SQL	SA is the first of the SQL Server and, therefore, is the first entry point for a hacker to try to test. We recommend compartmentalization of your users to roles and not to use the SA user.
MSSQL Single Instance	SQL	For a highly productive and stateoftheart se curity, only one instance of SQL Server should run on the machine at a time.
MSSQL Encrypted Login	SQL	Using encrypted logins is more secure than using regular logins.
MSSQL Too Many Users Have SA	SQL	Too many users have been granted System Administrator permissions. The more users with this full access permission, the less secure your server is.
MSSQL Unnecessary Users	SQL	Unless your system requires individual local user definitions, there is no need to have so many local users defined. It is better and smarter to use Server Roles and compartmentalize your permissions.
MSSQL Updated Patches	SQL	Keep up to date with the latest SQL Server patches and updates.
MSSQL Mixed Mode	SQL	Using a mixedmode authentication mode is less secure than having Windows only authentication mode.
MSSQL Disable Cmd Shell	SQL	xp_cmdshell Spawns a Windows command shell and passes in a string for execution. Any output is returned as rows of text. The xp_cmdshell op tion is a SQL Server server configuration option that enables system administrators to control whether the xp_cmdshell extended stored proce dure can be executed on a system. By default, the xp_cmdshell option is disabled on new instal lations. Before enabling this option, it is important to consider the potential security implications associated with the use of this option. Newlyde veloped code should not use this option as it should generally be left disabled.

Anomalies & AI Insights

The Anomalies module in Experda uses a machine-learning model (Isolation Forest) to automatically detect abnormal behavior in your SQL Server and host performance metrics.

It is designed to:

Learn what is “normal” for each metric on each
Continuously scan for unusual patterns and sudden
Present anomalies visually in the AI Insights
Notify DBAs and ops teams via email and group Processing schedule
Detection runs: every 2 hours on Experda’s own
Learning phase: baselines are retrained once every 24 hours so the model adapts to new workloads.

Use the Anomalies module when you want Experda to act like an always- on “early warning system” for performance issues—catching unusual behavior before users complain and highlighting incidents that are easy to miss in daily monitoring. Instead of relying on fixed thresholds (which often create noise), the module learns what is normal for each server and metric (including hour-of-day and weekend patterns) and then flags meaningful deviations such as sudden CPU jumps, unexpected disk queue spikes, abnormal SQL activity changes, or new usage patterns.

Practically, you use it for proactive troubleshooting, spotting regressions after deployments, detecting emerging bottlenecks, and ensuring the right people get notified through alerts and 12-hour digests—without needing to watch dashboards all day.

Real Case Example

In this example, Experda detected Disk Queue Length anomalies (OS-level) on server 192.168.15.50 for three different volumes: F: (1 anomaly), G: (4 anomalies), and M: (27 anomalies). The blue line shows the queue length over time, and the red markers highlight the specific time points that were statistically unusual compared to the server’s normal behavior. Notice how M: has both many more anomaly events and much larger spikes, which strongly suggests that this specific disk/volume experienced repeated I/O pressure during the period (a backlog of requests waiting to be served).

It’s good to know because Disk Queue Length is an early indicator of storage bottlenecks: when a drive’s queue builds up, applications (including SQL Server) can slow down even if CPU looks fine. Seeing the anomalies per drive helps you immediately narrow the investigation to the right volume (here, M:), and then correlate it with related metrics (Disk Bytes/sec, Read/ Write Latency) and real events (backups, ETL loads, index rebuilds, antivirus scans, file copies) to prevent performance incidents from repeating.

1. How the anomaly engine works

1.1 Data collection

Experda collects performance metrics both from:

Host / OS – g., CPU, Disk Bytes/sec, Disk Queue Length, Disk Latency.
SQL Server – e.g., Batches/sec, Compilations/sec, Distinct client computers, etc.

Each metric is stored as a time series per server and (where relevant) or instance (for example, drive letters C:, E:, F:). This information is available in the UI under the perfomance tabs.

1.2 Isolation Forest model

Experda uses an Isolation Forest model per metric:

The model learns What “normal” looks like from the last 24 hours of data.
Every 2 hours, it scores the latest samples.
Points that are statistically isolated from normal behavior are labeled as anomalies.

The model doesn’t rely only on raw metric values; it also uses context features such as:

Hour of day
Weekend/holiday vs weekday
Recent averages and standard deviation
Distance from previous peaks
Large changes compared to previous samples
Threshold violations (above configured baseline or percentile)

These features are visible in the Data Analysis tab for each anomaly.

1.2.1 Isolation Forest – How It Works

Isolation Forest is an outlier-detection algorithm built on a very simple idea:

Anomalies are few and different, so they are easier to isolate than normal points.

Instead of modeling “normal” behavior directly (like many other algorithms), it repeatedly splits the data at random and measures how many splits are needed to isolate each data point:

Normal points live in dense regions ⇒ need many splits to isolate.
Anomalies sit far away or in sparse regions ⇒ often isolated with very few splits.

The fewer splits a point needs on average, the more anomalous it is.

1.2.2 Building the forest

Random subsampling

To make the algorithm fast and robust, each tree is built on a small random subset of your data (for example, 256 points taken from the full time series of a metric).

This has several benefits:

Reduces the cost of training.
Increases the chance that outliers stand out clearly in at least some trees.
Makes the method scalable to very large datasets.

Growing one isolation tree For each isolation tree:

Start at the root with all points in the subsample.
Randomly choose a feature (e.g., CPU, Disk Queue Length, or a derived feature such as “Avg last values”).
Randomly choose a split value between the min and max of that feature.
Split the data into left/right child nodes.
Repeat steps 2–4 on each child node until:
– The node contains just 1 point, or
– You reach a maximum depth.

This creates a random tree where every leaf node contains a tiny region of the feature space (and usually one point). Many trees = a forest

You repeat this process dozens or hundreds of times, each time with:

A different random subsample.
Different random features and thresholds.

The result is a forest of isolation trees.
Each tree provides a noisy estimate of how easy it is to isolate each point; the forest average smooths out the randomness.

1.2.3 Scoring anomalies

Path length

For a given data point (for example, “Disk Queue Length = 17 at 10:01 AM, weekday, large change from previous, above threshold”):

You drop it down each tree.
Count the path length = how many splits from root to leaf.

If a point is an outlier, many trees will isolate it very high up in the tree → short path.

Normal points require more splits to isolate → long path.

Normalizing the score

Path length depends on sample size, so Isolation Forest uses a normalization function to convert it into an anomaly score between 0 and 1, roughly:

Score ≈ 1 → very anomalous (isolated quickly).
Score ≈ 5 → borderline.
Score ≈ 0 → very normal.

In Experda’s context, this score is then combined with your rule configuration (thresholds, consecutive events, weekend/hour of day effects, etc.) to decide whether to register an anomaly event and possibly trigger an alert.

The Anomaly score in Isolation Forest is non-intuitive. The lower the score the stronger the anomaly indicators are.

1.2.4 Why Isolation Forest is a good fit for performance metrics

No need for labeled data

You don’t need thousands of labeled “bad” incidents. The algorithm simply learns “normal” structure and then exposes what doesn’t fit.

Handles high-dimensional context

It can work on multiple features at once:

Raw metric value (e.g., Disk Queue Length).
Time features (hour of day, weekend).
Statistical features (running average, standard deviation).
Change features (large jump from previous value, distance from last peak).

Robust to non-linear patterns

Random splits can approximate complex shapes in the data without any explicit modeling.

Fast and scalable

Because it uses small subsamples and simple trees, training every 24 hours and scoring every 2 hours is computationally efficient, even for many servers and metrics.

1.3 Severity and status

Each anomaly rule has a Severity level:

Information – minor deviations, useful for awareness and trend analysis.
Warning – likely performance or stability risk, should be investigated.
Critical (if configured) – high risk of downtime or severe degradation.

An anomaly can also have status:

Active – anomaly is still relevant for the current period.
Dismissed – hidden from normal views but kept for history.
Excluded time – the specific time range is excluded from future training (useful for planned maintenance). This feature is usable when you recognize that an anomaly occurs in a timeframe that you are already aware of the normality of the For example if a backup occurs every day at 1am – you can expect a heavier load on the disks at that time each day.

2. Working with AI Insights

2.1 AI Insights overview

Open the AI Insights tab to see a dashboard of anomalies for the selected time range. Make sure to select a proper date range from the timeframe control.

Each tile represents a metric on a specific server/instance:

Metric name + Instance (e.g., Disk Queue Length – E:).
Server and anomaly master type (OS / SQL).
A line chart with red points marking detected anomalies.
A numeric badge that indicates the number of anomalies for that metric in the selected time window.
Show details button to open the Enhanced anomaly analysis.

Use this view to quickly see which servers and which metrics have been behaving abnormally.

2.2 Enhanced anomaly analysis – Details tab

Click Show details from the Anomaly overview tile to open the Enhanced anomaly analysis window.

Here you’ll see:

Summary header
– Metric name (e.g., Disk Queue Length). գ Severity chip (e.g., Warning).
– Status (e.g., Active).
– Timestamp when the anomaly was detected.
– Server and instance (e.g., Server: 192.168.15.50, Disk: F:). գ Current anomalous value.

Simple explanation banner
A short text listing the top contributing features that led the model to mark this as an anomaly, for example:

“Top contributing features: Has Large Change From Previous, Time since last peak, Current value”. See Anomaly dimensions for a deeper understanding.

Hover over the anomaly to see it’s score and time

Metric timeline

A detailed time series chart with normal samples and highlighted anomaly points. You can visually see how the anomaly compares to the rest of the period.

Other related metrics

Small charts for metrics that often correlate with the current anomaly metric (e.g., Disk Bytes/sec, Disk Read/Write Latency when looking at Disk Queue Length). This helps you understand cause and effect.

Actions
– Compare graphs – compare this metric with other metrics on the same server.
– Exclude time – mark this time range as “normal” (for example, maintenance), and this anomaly will not be detected for this time frame in the future.
– Dismiss – hide this anomaly.
– Delete – permanently remove the anomaly record.
– Add note – attach comments for your team (e.g., “Linked to deployment X”).

2.3 Enhanced anomaly analysis – Data Analysis tab

The Data Analysis tab exposes how the model “saw” this anomaly.

These are the feature dimensions used by the Isolation Forest, which can be adjusted in the anomaly setting \ configuration.

Hour of day

What it is: The hour when the anomalous sample occured.
Why it matters: Workloads are different at 03:00 vs 14:00.

The model learns typical behavior per hour and uses this as context.

Example: Batches/sec might be high at 10:00 (peak business hours) but low at 01:00. A value of 5,000 batches/sec at 01:00 is far more suspicious.

Is above threshold

What it is: Boolean indicator: is the metric currently above the configured threshold or percentile.
Why it matters: Ensures the anomaly is not only “different” statistically, but also above a practical operational threshold.
Example: CPU may spike from 5% to 40% (large change) but still be below the “CPU above 80th percentile” rule, so it might not be considered severe.

Last values all above threshold

What it is: Boolean indicator: have the last N data points all exceeded the threshold of “is above threshold”.
Why it matters: Helps detect persistent problems rather than one-off spikes.
Example: Disk Queue Length is above the threshold for five consecutive samples; the feature becomes TRUE, and the anomaly is more important.

Avg last values

What it is: Running average of the last N values before and including the anomaly.
Why it matters: Captures sustained high sikes.
Example: Average CPU for the last 10 samples is 85%. Even if the exact anomaly point is 90%, the average indicates prolonged stress.

Std last values

What it is: Standard deviation of the last N values.
Why it matters: Shows how stable or noisy the metric has been.
Example: If Disk Read Latency fluctuates a lot, a single high value might be less suspicious than a spike in a normally stable metric.

Has large change from previous

What it is: Boolean –the value has jump sharply compared to the previous sample.
Why it matters: Detects sudden incidents even when absolute levels are not extremely high yet.
Example: CPU jumps from 20% to 80% within 2 minutes → flagged as a large change.

Distance from last high value

What it is: Distance between the current value and the last recorded peak.
Why it matters: Helps distinguish new extreme peaks from normal oscillations around a previous high.
Example: Current Disk Bytes/sec is far above any previous high of the day, which strongly suggests an anomaly.

Is weekend

What it is: Boolean – is this sample on a weekend or holiday (according to your configuration).
Why it matters: Baselines may be very different on weekends or holidays, especially for business workloads.
Example: On Sunday the system is usually idle; high activity may be suspicious or a special event.

3. Configuring anomalies

Configuration is done under Settings → Anomalies.

3.1 Templates vs Custom setup

There are two levels of configuration:

3.1.1 Templates

Under Templates, Experda provides default anomaly configurations, such as:

Host
– Anomaly OS Default – covers CPU, Disk Bytes/sec, Disk Queue Length, etc.
SQL
– Anomaly SQL Default – covers Batches/sec, Compilations/sec, Distinct client computers, and other SQL metrics.

You can:

Use them as-is for quick onboarding.
Edit thresholds and dimrnsions.
Attach templates to groups of servers via Affiliated server(s).

3.1.2 Custom server setup

Under Custom setup, you can override template settings per server

Select a specific server (e.g., 168.15.50).
Configure Host and SQL anomaly rules just for that server.
Enable/disable individual metrics (CPU, Disk Queue Length, etc.).

This is useful when:

Some servers handle batch workloads with different patterns.
Test/dev servers should be monitored differently from production.
Certain metrics are irrelevant for a given server.

4. Rule details – dimensions & examples

Each metric rule (CPU, Batches Per Sec, Disk Queue Length, etc.) has a Rule Details page.
Below are the main configuration dimensions and how to use them. Here are the dimensions for the CPU anomaly detection rule for example.

4.1 Severity

Choose Information, Warning, or Critical.

Information: used for non-urgent deviations (e.g., temporary increases in Batches/sec).
Warning: used for cases that may impact performance.
Critical: used for conditions that may lead to downtime or data issues.

Example:

CPU above 80th percentile for 10 minutes → Warning
CPU above 95th percentile for 30 minutes + disk queue anomalies → Critical

4.2 Metric threshold: “X above”

For CPU

CPU above – a percentile-based threshold (e.g., 80).
Meaning: current CPU must be higher than 80% of historical values to contribute to an anomaly or to be considered for an anomly.

Example:

CPU above = 80

If the 80th percentile is 65%, then current CPU must exceed 65% to qualify.

For Batches Per Sec (and similar metrics)

Batches per sec above – percentage compared to learned
For example, 50% means “current value must be at least 50% higher than the learned baseline” or above the 50th percentile (depending on how you calibrated it).

Example:

Normal Batches/sec ≈ 1,000.
Batches per sec above = 50% → anomalies start around 1,500+.

4.3 Running average window (“Running X avg of”)

What it is: number of samples used to calculate the running
Example fields:
– Running CPU avg of: 5
– Running Batches per sec avg of: 10

Example:

If samples are collected every 2 minutes and Running CPU avg of = 5, the model uses the last 10 minutes of CPU data to compute the context average.

Use smaller values for fast-changing metrics; larger values for smoother, long-term trends.

4.4 Consecutive high events

Field:
– Consecutive high Batches per sec of
– Consecutive high CPU events

What it is: how many samples in a row must be “high” before an anomaly is raised.

Example:

Consecutive high Batches per sec of = 5
With 2-minute sampling, Experda requires 10 consecutive minutes of high values.

This prevents false positives from one-off spikes.

4.5 Weekend / holiday effect enabled

Toggle: Weekend / holiday effect enabled
What it does: tells the model to treat weekends and configured holidays differently when learning normal ranges.

Example:

On weekends, CPU usage is usually below 10%. A spike to 60% on Sunday is much more suspicious than the same value on Monday. Enabling this effect helps the model reflect that difference.

Turn this off if the server has 24/7 uniform workload with no weekday/ weekend difference.

4.6 Large change from previous chronological sample

Toggle: Large change from previous chronological sample
What it does: emphasizes sudden jumps or drops in the feature Example:

Disk Queue Length jumps from 1 to 40 within a single interval. Even if 40 is not “extreme” compared to historical peaks, the sudden jump is itself anomalous.

Useful for catching incidents at their onset, before long-term averages adapt.

4.7 Hour of day effect enabled

Toggle: Hour of day effect enabled
What it does: allows baselines to be different per hour of the Example:

Nighttime jobs may run at 02:00 every day with very high Disk Bytes/sec. With the hour-of-day effect on, the model learns 02:00 separately and won’t flag those jobs as anomalies.

Turn this off for servers with very flat usage across the day, where hour-by- hour patterns are not relevant.

4.8 Use distance from last high

Toggle: Use distance from last The longer time a high value was not present, the larger the impact is when it does.
What it does: adds the “Distance from last high value” feature into the anomaly calculation. If a high value event did not occur for a long time this dimension will be activated.

Example:

Disk Read Latency reaches 10 ms while previous maximum for the day was 2 ms – distance is large, so the anomaly score increases.

4.9 Standard deviation over last N samples

Field: Std dev over last [N] samples (with a toggle)
What it does: tells the model to explicitly consider short-term variability. Example:

Std dev over last 20 samples enabled.
If Batches/sec is normally stable (std dev small), any sudden increase in variability itself may be anomalous.

Use this when you care about noise and instability, not just raw levels.

4.10 Absolute CPU filter (CPU >)

Field: CPU > (e.g., 100)
What it does: an absolute guardrail: only consider anomalies when CPU exceeds a given percentage.

Example:

CPU > 60 ensures no anomaly is raised when CPU is at 40%, even if the model thinks it’s unusual for that specific time.

This is useful when you want to avoid being alerted about deviations that are still operationally harmless.

5. Alerts and email digests

Per-rule email alerts

At the bottom of each Rule Details screen:

Email toggle – enable or disable email alerts for this rule.
Send to contacts – choose one or more named contacts from your Experda contacts list.
Send to groups – choose one or more groups (e.g., “DBA On-Call”, “Ops Team”).

When enabled, these recipients will receive alerts whenever a new anomaly is detected for that metric, server, and severity combination.

Example:
For the CPU rule on a production server, send Critical anomalies to the “DBA On-Call” group and Warning anomalies to both “DBA Team” and “Ops Team”.

5.1 12-hour anomaly email digest

In addition to per-rule alerts, Experda sends a periodic email digest:

Covers the last 12 hours.
Includes all anomalies that:
– Occurred in that period, and
– The recipient is registered to (via the rule’s contacts or groups).

The digest typically contains:

Summary counts per server and per metric.
Top critical anomalies with direct links to AI insights.
Short explanations (top contributing features) so you can quickly triage.

This helps busy teams avoid alert fatigue while still maintaining situational awareness across the environment.

You can set the anomaly digest interval in the global settings, under the anomaly category (see image below)

6. Best practices

6.1 Start with templates

Begin with Anomaly OS Default and Anomaly SQL Default.
Run them for a few days to allow the models to stabilize their baselines.
Review anomalies in AI Insights to see which are useful and which are noise.

6.2 Tune thresholds and dimensions

If you see too many minor alerts:
– Increase CPU above percentile.
– Increase Consecutive high events.
– Enable Weekend/holiday and Hour of day effects to add context.

If you are missing incidents:
-Decrease thresholds.
– Enable Large change from previous sample and Use distance from last high.

6.3 Use custom setup for special servers

Batch servers / data warehouses often have nightly spikes – tune their rules separately.
Dev/test servers might need Information-only severity or even disabled rules.

Knowledge Center

Key features

Client/Company

Managed Server

Agent

Multiple Agent Installations

OnPrem installation

Cloud installation (common)

Prerequisites: Hardware and software requirements

Step 1 – Sign up to Experda

Step 2 – Create a new company from the main dashboard

Step 3 – Download agent installation file

Step 4 – Install the agent on a dedicated machine

Step 5 – Add a new Managed Server [see next section]

Top Left Navigation Menu

Server Tree

The server

Alert Counter View

Paging Control

Timeframe Control

Timeframe view

Timeframe control Live view – timeline view

Timeframe control History view

Dashboard View – Default View

Dashboard View – Gallery View

Dashboard View – List View

Main Company’s Managed Server View

Main managed Server Table

Client Dashboard Additional Actions/Filtering Options

Server Dashboard – Part 1

Server Dashboard – Part 2 – Performance Overview

Server Dashboard – Part 2 – Performance Overview

Queries Grid

Queries Grid Main Table

Enhanced Charts

Enhanced Charts

Enhanced Chart Filter/Grouping

Compare Graphs Feature

CPU – Part 1

CPU – Part 2

Memory – Part 1

Memory – Part 2

Disks/Partitions – Part 1

Disks/Partitions – Part 2

Blocking – Part 1

Blocking – Part 2

Blocking by application

The deadlock chart

Temp DB – Part 1

Temp DB – Part 2

Main waits chart

Wait Category Grid

Wait Category Detailed Grid

Health Check Reports

Health Check Report Settings

How do I find the health check report settings?

Creating a new health check report template

Affiliating a server with a template

Customizing a server to exclude/include tests

Customizing a server to exclude/include tests

Suspicious Activity Settings

Main Suspicious Records Settings – Records View

Main Suspicious Records Settings – Whitelist View

Capacity Planning – Part 2

Percent Of Use Vs. Percentage Of Growth

Prediction Size From Chart

Prediction Grouping

Availability Group List Overview

Expanded Availability Group

Jobs Company Overview

Jobs Server Overview

Specific Job History

Understanding alert attributes

Alert Templates

Alert Dashboard View

Alert Drawer Display

Alert full view

Configuring alerts

Alert Configuration Views

Affiliated Server View