This is the final article in our three-part series on SCADA networks. Part 1 covered SCADA network equipment, and Part 2 covered SCADA networking protocols and basics. Let's wrap up the series with general SCADA networking and troubleshooting.
TCP/IP is named for two of its core protocols: Transmission Control Protocol (TCP) and Internet Protocol (IP). Together with the related protocols in the suite, they govern how data is addressed, transmitted and delivered between networked entities.
As we touched on in Part 2, a protocol is a standardized set of rules that allows two or more networked entities to communicate. In order to send data back and forth successfully, the sender and receiver must agree on the protocol—the rules.
Before TCP/IP, computers from different vendors had difficulty communicating with one another. Each vendor had its own rules. The U.S. Department of Defense developed TCP/IP as a clear, universal standard that allows computers from different vendors to communicate.
In the TCP/IP model, the different communication tasks are divided into four levels or layers. Data must travel through all four layers before it reaches its destination. Each individual layer has its own responsibilities and deals with different protocols. One of the main advantages of this stack model is that it allows you to isolate network communication problems by layer and work in a methodical way to troubleshoot issues.
We touched on the four TCP/IP layers briefly in Part 2. Here they are for review, from top to bottom: application, transport, Internet and network access/physical.
Data originates from applications. In the case of a solar PV site, there are two ends to that application path—the field devices that capture data, and the SCADA applications that store that data and provide control.
The field devices (inverters, trackers, MET station sensors, etc.) have sensors that gather information. The SCADA system pulls in that data and makes it available to other applications. The historian logs and stores the data on local or cloud-based servers, which lets operators and stakeholders review the plant's historical data. This can be useful for troubleshooting. The HMI displays the data and allows operators to give commands to the plant controller.
In Part 1, we defined switches, routers and firewalls as key pieces of network hardware. Now, let's look at them in terms of the TCP/IP stack model.
Switches work at the lowest and most basic TCP/IP layer, the network access/physical layer. When two devices on the same local area network (LAN) communicate via Ethernet, the switch forwards each frame only out of the port where the destination device is connected. This gives the two devices what is effectively a point-to-point connection and prevents data collisions. The switch does this by recording, in its MAC table, the MAC address of each device connected to each of its ports. That's it. A standard (Layer 2) switch goes no further than this basic layer: it does not look at IP addresses.
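The MAC table behavior described above can be sketched in a few lines of Python. This is a toy model rather than how any particular switch is implemented: the class name `LearningSwitch`, the port numbering and the MAC addresses are all made up for illustration, and real switches also age out old entries, which is omitted here.

```python
class LearningSwitch:
    """Toy model of a switch MAC table (hypothetical, for illustration only)."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is forwarded out of."""
        # Learn (or refresh) which port the sender is reachable on.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood out of every port except the ingress port.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            # Destination sits on the port the frame arrived on: nothing to forward.
            return []
        # Known destination: forward out of exactly one port.
        return [out_port]


switch = LearningSwitch(num_ports=4)
# The first frame floods, because the switch has not yet seen the destination.
first = switch.receive(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")
# The reply goes to a single port, because the first sender was learned.
reply = switch.receive(1, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")
```

Note that the only input to the forwarding decision is the MAC address. IP addresses never enter into it, which is why this work belongs to the network access layer.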
Routers work on the next layer up, the Internet layer, which deals with IP addresses. Rather than moving data between devices on one network, they move data between different networks. Each port on a router has an IP address that acts as the gateway for the network it's connected to.
When a router receives a packet, it checks the packet's destination IP address against its routing table and forwards the packet out of the interface that offers the best path to the destination network. The best path is the most specific matching route. When several routes are equally specific, the router prefers the one with the lowest cost.
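As a rough illustration, a routing table lookup can be sketched with Python's standard `ipaddress` module. The sketch assumes the usual longest-prefix-match rule, where the most specific route that contains the destination wins. The subnets and interface names (`eth0`, `eth1`, `eth2`) are invented for the example, and route metrics are left out.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    ("10.10.1.0/24", "eth1"),   # e.g. an inverter subnet
    ("10.10.0.0/16", "eth2"),   # the rest of the plant network
    ("0.0.0.0/0", "eth0"),      # default route
]


def lookup(dest_ip, routes=ROUTES):
    """Return the interface of the most specific route containing dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    best = None  # (network, interface) of the best match so far
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            # A longer prefix means a more specific route.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, interface)
    return best[1] if best else None
```

With this table, `lookup("10.10.1.25")` returns `"eth1"`, `lookup("10.10.7.3")` falls through to the broader `"eth2"` route, and an outside address such as `lookup("8.8.8.8")` takes the default route on `"eth0"`.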
Routers can also provide access control. An access control list (ACL) permits or blocks traffic based on source and destination IP addresses and application port numbers. This is called stateless access control because the router evaluates each packet on its own against the rules, without tracking the connection the packet belongs to. The router simply filters traffic based on the source and destination IP and port.
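Here is a minimal sketch of stateless filtering, assuming a first-match rule order and an implicit "deny" when no rule matches. Both are common conventions, but check your own router's documentation. The `Rule` class, the addresses and the use of port 502 (the registered Modbus TCP port) are illustrative assumptions.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "permit" or "deny"
    src: str                         # source network in CIDR notation
    dst: str                         # destination network in CIDR notation
    dst_port: Optional[int] = None   # None matches any port


# Hypothetical ACL: let the SCADA subnet reach one device on port 502,
# and block everything else headed for the device subnet.
ACL = [
    Rule("permit", "10.10.1.0/24", "10.10.2.10/32", 502),
    Rule("deny", "0.0.0.0/0", "10.10.2.0/24"),
]


def evaluate(acl, src_ip, dst_ip, dst_port):
    """Return the action of the first matching rule (stateless, per packet)."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in acl:
        if (src in ipaddress.ip_network(rule.src)
                and dst in ipaddress.ip_network(rule.dst)
                and rule.dst_port in (None, dst_port)):
            return rule.action
    return "deny"  # assumed implicit deny when nothing matches
```

Each call looks only at the packet in front of it. Nothing is remembered between packets, which is what makes the filter stateless.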
Firewalls operate on a higher level than routers. Modern firewalls can do everything routers do, but they have more features, processing power and security capabilities. They inspect the data more deeply and can monitor, filter or block traffic based on more advanced parameters than just source and destination IP and port, including the state of the connection each packet belongs to. This is called stateful access control.
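To make the contrast with stateless filtering concrete, here is a simplified sketch of connection tracking. It is not how any specific firewall product works: the `StatefulFilter` class and its policy (trusted hosts may open connections, and inbound traffic is allowed only as a reply) are assumptions chosen for illustration. Real firewalls also track protocol state and expire idle connections.

```python
class StatefulFilter:
    """Toy connection-tracking filter (hypothetical, for illustration only)."""

    def __init__(self, trusted_hosts):
        self.trusted = set(trusted_hosts)
        self.connections = set()  # (src_ip, src_port, dst_ip, dst_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Trusted hosts may open connections; remember each one."""
        if src_ip not in self.trusted:
            return False
        self.connections.add((src_ip, src_port, dst_ip, dst_port))
        return True

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Allow inbound traffic only if it is a reply to a tracked connection."""
        # A reply has source and destination swapped relative to the request.
        return (dst_ip, dst_port, src_ip, src_port) in self.connections


fw = StatefulFilter(trusted_hosts={"10.10.1.5"})
# Unsolicited inbound traffic is blocked.
blocked = fw.inbound("10.10.2.10", 502, "10.10.1.5", 40000)
# After the trusted host opens the connection, the same reply is allowed.
opened = fw.outbound("10.10.1.5", 40000, "10.10.2.10", 502)
allowed = fw.inbound("10.10.2.10", 502, "10.10.1.5", 40000)
```

The same inbound packet gets a different verdict depending on what the filter has seen before. That memory is the difference between stateful and stateless access control.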
SCADA system alarms, displayed on the HMI screen, alert you when devices stop communicating on the network. This "device down" alarm may be depicted in a number of different ways depending on your SCADA alarming software. It might be a link, a notification bubble, or an animation that changes from green to red.
It helps to gather some basic information before starting the troubleshooting process. How long has the device been offline? If the device was never up, there might be something wrong with the logic embedded in it. If the device was up but now has stopped communicating, there might be a network-related or device-related issue. This information may not be 100% accurate (operators may not know all the details) but it can provide some helpful background.
No single sequence of steps works for every troubleshooting situation. However, the overall goal is always the same: narrow down the possibilities as quickly as possible. SCADA troubleshooting ability grows with experience and knowledge of the system.
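One quick way to narrow things down is to check whether a device still accepts TCP connections on its service port. The helper below is a minimal sketch using Python's standard `socket` module. The function name `tcp_port_open` is our own, and which port to test depends on your devices (502 is the registered port for Modbus TCP, for example).

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A call such as `tcp_port_open("10.10.2.10", 502)`, with a made-up device address, returns `True` if the device answers. That suggests the network path and the device's TCP service are working, and points the search toward the application side. A `False` result points toward the device itself or the layers below.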
Here are the typical steps to troubleshoot a communication problem:
This concludes our article series on SCADA networking. We hope you've found it helpful and valuable!
If you are a solar industry professional who wants to learn more about SCADA networks, we invite you to attend our quarterly Solar PV Operations Training sessions. SCADA networks are covered in-depth on Day 2 of the 3-day training. The training is system agnostic, meaning it can be applied to any SCADA system platform.