Weekend commuters got a taste of a possible worst-case transit scenario when the entire BART system was shut down for most of the morning on Saturday, March 9th. While initial reports hinted the outage was a result of either software or human error, recent cyber-attacks against local transit systems have raised speculation over other causes.
Partial service was restored shortly after 9 AM, and full service by 11 AM. BART organized a bus bridge, with help from Muni and SamTrans, to pick up the slack during the outage.
According to BART’s most recent statement on the matter, the outage was caused by a computer network failure which prevented power from activating the agency’s train routing system. “The failure was software related at one switch that is part of a complex computer network,” the release states.
Earlier statements from BART noted that maintenance work was being done on the system’s uninterruptible power supply close to the time of the outage, but that it had since been determined that this was not the cause of the outage: “BART staff is waiting for failure analysis results from Cisco to understand the exact cause of the failure. Once we understand the exact cause we can determine any next steps needed.”
A KTVU television report raised the possibility of a cyberattack as one of many conceivable causes of the outage. In 2011, persons identifying themselves as members of the hacktivist group "Anonymous" claimed credit for cyberattacks against a public-facing website operated by BART for marketing purposes, as well as a website run by the BART Police Officers Association.
The intrusions, which targeted personal information that was then released to the public, were in response to the agency’s shutting down of Wi-Fi and cellular service in underground stations in order to disrupt protests against BART police shootings.
Meanwhile, in November 2016, San Francisco's Muni system was hit with a ransomware attack which affected fare terminals. The perpetrator in that attack demanded a ransom of approximately $73,000, but was apparently hacked himself by a security researcher. A similar attack took place last year in Sacramento.
A recent report in Security Week described the vulnerability of mass transit systems, due to the many generations of automation technology involved in their operation:
“It was revealed by a Department of Homeland Security report, that there is elevated risk in transportation due to the aging infrastructure used across the industry. These legacy systems are not limited to SCADA. The industry as a whole has made the move towards network-enabled “intelligent public transport” (IPT) but has simultaneously been slow to phase out aging systems. Additionally, mass transit systems rely heavily on networked devices for positioning, routing, tracking, access controls, navigation, and more. These devices provide the benefit of faster, more automated transit systems, but must also be recognized as additional system access points that require oversight and protection.”
That said, the timeframe of Saturday’s failure means that a cyberattack is less likely in this case.
“It’s not impossible, but isn’t likely. A bad actor isn’t going to make a mistake with timing,” says Larry Bivens, Security Operations Manager with Rendition Infosec, an information security incident response and training firm. “They’ll want to maximize the impact of any attack.” In other words, a cyberattack would be far more likely to happen during the busiest - as in worst possible - time for the system.
Answers to the speculation finally came at the BART Board of Directors’ meeting the following Thursday.
Directors heard from BART system managers, including Chief Engineering Officer Tamara Allen, who described the type of failure which occurred on Saturday as "very rare."
"The last time we had a network switch failure of this kind which resulted in major service disruptions occurred in March 2006 during an afternoon commute,” Allen said. “In that case the failure was triggered by human error during a software update. This weekend's failure was due to a failure of the switch itself – a component failure.” Allen went on to describe how, “Instead of processing and passing on data, the switch kept recycling data, generating an unmanageable data spike,” which ended up overwhelming the system. All of BART's field network sites had to be manually rebooted because of the failure, requiring a significant amount of manpower.
Allen then outlined two ongoing major efforts by the BART system to ensure that similar failures will not happen again, which she described as being "accelerated in the wake of this particular failure."
These include upgrading computer hardware and software in line with current and emerging standards in data management security, as well as, perhaps most importantly, finally installing a remote redundant disaster recovery facility which would allow the BART system to keep running in the event of a primary system failure.
Allen said that facility was expected to be "fully built out within a month, and fully operational within a couple of months."
Director Janice Li, while thanking BART staff for their quick work in ameliorating the crisis, continued to voice concerns over restoring public confidence in the system with greater transparency:
"We’re really fortunate that this happened on a Saturday morning, rather than on a weekday where we provide 400,000 trips per day,” Li said. “It's really important that riders understand what's happening. I think that time and time again, the lack of information coming from BART as an agency is really troubling, so I'm glad there was a press statement on Saturday, and that there continues to be communication on the matter."
Board President Bevan Dufty closed the report session with his comments: "[I’m] sure that we will be on a path where very quickly we will have a redundant system so that what happened on Saturday will not happen again."