The troubleshooting process for a SCADA system largely depends on the preferences and specifications of the operators and asset owners. Some wish to be as hands-on and self-sufficient as possible, while others want to focus solely on the big picture of operations and rely on troubleshooting and consulting services.
Whether you fall into the first or second group, it’s good to have an understanding of what SCADA troubleshooting entails. We’ll cover the basics of system alarms and SCADA troubleshooting steps in this article.
If we can assume a basic understanding of the system, then properly configured alarms can be our window into the issue.
System alarms can be categorized by their origin and their severity. Your SCADA subcontractor can take you through the alarms on your specific system during the project handoff.
Main SCADA system alarms include power supply notifications (activation of the SCADA UPS and backup power supply) and network issues such as loss of IP connection.
The most prevalent SCADA alarm is ‘Device Down.’ This alarm occurs when a device is no longer communicating on the network. Because a solar site typically relies on network protocols and connections, the simple steps below verify whether a device is off the network or is still on the network but has stopped communicating.
1. ‘Ping’ the device. You can do this by opening a command prompt (DOS prompt for old timers) and running the ping command against the device’s IP address, for example: ping 10.10.10.1
If the ping succeeds, the device is still on the network, so the problem is that it is not communicating with SCADA. If the ping fails, the device is off the network.
2. Test protocol communication. For example, if the device is an inverter or tracker and communicates via Modbus TCP/IP, run a Modbus scanning tool (eg Modscan) and test for data. If you get either exception errors or a failed protocol connection, then you know the end device is not talking to SCADA even though it is on the network.
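The two checks above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not a replacement for a Modbus scanning tool. The function names are hypothetical, and the unit ID, register address and register count are placeholders that must come from the device’s Modbus map. The meaning of the ping return code also varies by operating system.

```python
import socket
import struct
import subprocess
import sys

MODBUS_PORT = 502  # standard Modbus TCP port


def ping(host, timeout_s=3):
    """Step 1: send one echo request with the operating system's ping command."""
    count_flag = "-n" if sys.platform.startswith("win") else "-c"
    try:
        result = subprocess.run(["ping", count_flag, "1", host],
                                capture_output=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def build_read_request(unit_id, start, count, transaction_id=1):
    """Build a Modbus TCP 'Read Holding Registers' (function 0x03) request."""
    pdu = struct.pack(">BHH", 0x03, start, count)
    # MBAP header: transaction id, protocol id (always 0), length of unit id + PDU, unit id
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_response(frame):
    """Return ('ok', registers), ('exception', code) or ('invalid', None)."""
    if len(frame) < 9:
        return "invalid", None
    if frame[7] & 0x80:  # high bit of the function code marks an exception response
        return "exception", frame[8]
    byte_count = frame[8]
    data = frame[9:9 + byte_count]
    if len(data) != byte_count or byte_count % 2:
        return "invalid", None
    return "ok", list(struct.unpack(f">{byte_count // 2}H", data))


def poll(host, unit_id=1, start=0, count=2, timeout_s=3.0):
    """Step 2: request a few holding registers over Modbus TCP."""
    try:
        with socket.create_connection((host, MODBUS_PORT), timeout=timeout_s) as sock:
            sock.sendall(build_read_request(unit_id, start, count))
            return parse_response(sock.recv(260))
    except OSError:
        return "no connection", None


def classify(ping_ok, modbus_status):
    """Combine both steps into the distinction the article describes."""
    if not ping_ok:
        return "off network"
    if modbus_status == "ok":
        return "on network and communicating"
    return "on network but not communicating"
```

For the example address used above, a check could be run as `classify(ping("10.10.10.1"), poll("10.10.10.1")[0])`.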
Plant controller alarms notify you of problems related to operation, generation and compliance (eg overproduction, underproduction, low DC/AC voltages), and possibly financial excursions (eg underperformance due to temperature or lower than expected/normalized DC values).
The plant controller alarms can cover a broad range, but listed below are some of the most serious ones that could have a direct impact on owner operations and finances.
The Automatic Voltage Regulation (AVR) controller operates the solar PV site’s inverters to hold the voltage setpoint required by the utilities and ISOs. A specific voltage may also be part of the site’s Power Purchase Agreement (PPA) and Interconnection Agreement (IA). Imbalances or discrepancies are directly tied to the grid and require immediate attention.
Going out of the specified voltage range can also have negative ramifications for the site itself. If the plant starts producing or absorbing excess reactive power while trying to control voltage, there is a risk of low-side equipment faults (eg overvoltage or undervoltage).
In troubleshooting AVR alarms, first verify that nothing on the grid has changed (eg switching of radial lines, shunt capacitors/reactors placed in or taken out of service, etc). If no changes have been made to the grid configuration, check whether any inverter firmware has been updated. Finally, take a close look at the Power Plant Controller (PPC): has anyone accessed or modified the configuration? Are the process variable (PV), setpoint (SP) and PID output (CV) scaled correctly? Are the power factor (PF) limits enabled?
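The scaling question can be made concrete with a small sanity check. The sketch below is illustrative only: the `Signal` type, the `check_loop` function and all ranges and units are hypothetical, and a real PPC exposes these values through its own configuration tools. It flags the two mistakes that most often sit behind a scaling problem, namely a PV and SP that do not share units or range, and a value that falls outside its configured range.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    value: float
    low: float    # bottom of the configured engineering range
    high: float   # top of the configured engineering range
    units: str


def check_loop(pv, sp, cv):
    """Return a list of scaling findings for one PID loop (empty list = nothing found)."""
    findings = []
    # A PID block compares PV to SP directly, so both must use the same units and range.
    if pv.units != sp.units:
        findings.append("PV and SP use different units")
    if (pv.low, pv.high) != (sp.low, sp.high):
        findings.append("PV and SP use different ranges")
    # A value far outside its range often means it was entered in the wrong units.
    for signal in (pv, sp, cv):
        if not signal.low <= signal.value <= signal.high:
            findings.append(f"{signal.name} is outside its configured range")
    return findings


# Hypothetical example: the setpoint was entered in volts instead of kilovolts.
pv = Signal("PV", 34.4, 0.0, 40.0, "kV")
sp = Signal("SP", 34500.0, 0.0, 40.0, "kV")
cv = Signal("CV", 12.0, -100.0, 100.0, "%")
print(check_loop(pv, sp, cv))
```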
Similar to AVR control, Active Power Control aims to regulate the amount of active power (measured in megawatts or kilowatts) at which the plant should run. If the plant is not running at the specified MW or kW, that triggers an alarm.
These alarms are important because many solar PV plants today are overbuilt: they are capable of producing more power than they’re rated for and more than their PPA allows. If such a plant goes above the maximum setpoint in its PPA, the alarm is the operator’s signal to take a deeper look.
While not as impactful on the grid as voltage regulation, overproducing can cause contractual issues for operators and owners. The same basic steps apply to this control function as well: verify system changes, inverter changes and PPC changes.
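The alarm condition itself is a simple comparison of measured output against the setpoint. The following sketch assumes a hypothetical deadband of 0.5 MW so that normal measurement noise does not raise an alarm. Real plant controllers typically add time delays and other logic that is not shown here.

```python
def active_power_alarm(measured_mw, setpoint_mw, deadband_mw=0.5):
    """Return 'over', 'under' or None for a measured plant output.

    deadband_mw is a hypothetical tolerance around the setpoint.
    """
    deviation = measured_mw - setpoint_mw
    if deviation > deadband_mw:
        return "over"
    if deviation < -deadband_mw:
        return "under"
    return None


# Hypothetical plant with a 100 MW setpoint
for reading in (101.2, 100.3, 98.0):
    print(reading, active_power_alarm(reading, 100.0))
```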
Substation and control house alarms involve communications to the protection devices, revenue meters and building environment. Also included in this group are breaker status alarms and transformer alarms for oil level, winding temperature, and liquid temperature. It is extremely important to ensure that the relay alarm bits are properly mapped during commissioning so that operations can identify the relay, the issue and the possible cause before a technician mobilizes to site.
Field equipment alarms involve the PV arrays, inverters, weather stations and other field equipment. Among these, tracking system alarms and weather station alarms are the most common.
Inverter alarms alert operators to any faults, warnings or errors, including those related to voltages, currents, and frequency. Transformer alarms monitor temperature, pressure, and liquid levels. Tracker alarms trigger when the system is not at setpoint or when other malfunctions occur. There are also alarms to notify operators of loss of communication to any piece of field equipment.
The exact troubleshooting process will largely depend on the specifications of the operators, owners and O&M contractors.
For example, at Nor-Cal Controls, we have SCADA system clients who want only basic troubleshooting training along with the logins for their system, then contract with us for troubleshooting services during the operational phase. They prefer not to dig into the technical aspects of the system. Other clients want to be as self-sufficient as possible in troubleshooting their systems and resolving issues, and instead come to us for in-depth training for their teams. There is no right or wrong way—it’s all up to client preference.
Though the specific troubleshooting process will vary, there are widely agreed upon principles and techniques for troubleshooting that can be distilled into the following bullets:
Let’s take a look at a typical SCADA troubleshooting process.
While all alarms should be addressed, not all signal an immediate risk of production loss and/or non-compliance. Alarms are categorized as High, Medium or Low Priority—how exactly each alarm is assigned will vary by system. Your SCADA subcontractor can cover the different priorities and how to react to each. Here are some examples of alarm priorities in a typical SCADA setup.
High Priority: Warning of potential equipment failure, loss of production, and deviations from the specified voltage and power, due to a serious fault. Requires immediate attention.
Medium Priority: Maintenance alarms that do not require immediate attention but will need attention soon.
Low Priority: Informational alarms not requiring immediate attention.
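One way to picture the three levels is as a sort order for the active alarm list. The sketch below is illustrative only: the `Priority` and `Alarm` types are hypothetical, and the priority given to each sample alarm is an assumption, since the actual assignment varies by system.

```python
from enum import IntEnum
from typing import NamedTuple


class Priority(IntEnum):
    HIGH = 1     # requires immediate attention
    MEDIUM = 2   # needs attention soon
    LOW = 3      # informational


class Alarm(NamedTuple):
    source: str
    message: str
    priority: Priority


def triage(alarms):
    """Order alarms so the most urgent come first (ties keep their original order)."""
    return sorted(alarms, key=lambda alarm: alarm.priority)


def immediate(alarms):
    """Return only the alarms that require immediate attention."""
    return [alarm for alarm in alarms if alarm.priority is Priority.HIGH]


# Hypothetical active alarm list
active = [
    Alarm("Weather station", "Sensor calibration due", Priority.MEDIUM),
    Alarm("SCADA server", "Operator logged in", Priority.LOW),
    Alarm("PPC", "Voltage outside specified range", Priority.HIGH),
]
for alarm in triage(active):
    print(alarm.priority.name, alarm.source, alarm.message)
```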
Your SCADA subcontractor can cover exactly what to do in order to troubleshoot remotely. This includes logging into the SCADA servers, Historian servers, data concentrators and PLCs (Programmable Logic Controllers). This should give you a good starting point on how to troubleshoot alarms and investigate the root cause.
The root cause can be anything from an operating system lockup, which can be corrected by simply resetting or power cycling the equipment, to serious issues involving high voltage/current equipment that could result in long-term problems for the plant. Often, several things happen at once in a chain reaction. For example, an equipment fault can cause downstream or upstream faults as well.
Problems can often be investigated remotely, but not always. After remote troubleshooting efforts have been exhausted, it’s time for boots-on-the-ground technicians to take a look onsite. Broken network connections and equipment failures are common reasons to dispatch onsite technicians.
When all other troubleshooting efforts have been exhausted and/or techs have determined that everything’s fine in the field, it’s usually time for the SCADA subcontractor to get involved.
A common reason for SCADA subcontractor involvement is a software or hardware issue. If needed, the SCADA subcontractor can get in touch with the OEM to resolve the problem. Field wiring issues can also fall under the purview of the SCADA subcontractor. For example, if a weather station sensor stops communicating, the SCADA subcontractor can work with the onsite team to verify whether a sensor needs to be rewired or replaced.
Putting off a critical alarm can lead to:
All three of these ultimately result in a loss of revenue. If your equipment fails, it falls outside of warranty. You may have to replace system equipment. You may be fined for compliance issues. In short, your investment in the solar PV site suffers.
As we discussed earlier, the amount of initial troubleshooting training for your SCADA system varies depending on the specific project and the preference of the asset owners and operators. How hands-on or hands-off do you want to be with the system? At Nor-Cal Controls, client preferences range from an online seminar covering just the alarm and troubleshooting basics to comprehensive two-day onsite training for entire teams.
But what if you want additional training after the project handoff? Perhaps you initially wanted to rely more on contracted troubleshooting services, but have since expanded your team and have the capacity and desire to handle troubleshooting internally. Are there options for training?
Yes.
Nor-Cal Controls offers an advanced two-day SCADA troubleshooting course for EPC contractors, operators and solar industry professionals. The course is individually tailored to meet the needs and desires of the group requesting training. Students will develop the confidence and knowledge necessary to accurately and quickly determine problem areas and put forth solutions to keep plants up and running.
The training topics and principles covered can be applied to any SCADA system platform (hardware/software). For more information on our SCADA troubleshooting courses, visit our training page.