Feb 24 glitch: NSE admits to faulty infra

PALAK SHAH Updated - March 22, 2021 at 10:14 PM.

Blames hardware vendor; says it had made certain changes that were not appropriate

11/10/2019 MUMBAI: (File photo) National Stock Exchange at BKC in Mumbai. Photo by Paul Noronha

Did faulty trading infrastructure at the National Stock Exchange lead to a four-hour stock market disruption on February 24? For the first time, the NSE has revealed that its trading infrastructure, which was not as per the ‘required specifications,’ triggered the tech glitch.

The NSE says that one of its hardware vendors had made certain changes to the exchange’s trading infrastructure that were “not appropriate for its set-up and were not communicated to it.” Until now, the NSE had blamed telecom service providers for the glitch.

Storage area network

“Incident analysis showed that the problem was caused by failover logic implemented by the vendor which did not conform to NSE’s stated design requirements, coupled with issues in the configuration done by the SAN (storage area network) vendor that triggered the failover logic. We note that the specific failure logic used by the vendor is not documented, was not communicated to NSE, and was not appropriate for NSE’s set-up. The resultant SAN failure led to the incident on February 24,” the NSE said in its statement.

The NSE has suffered more than a dozen tech glitches in the past 10 years, but this is the first time the exchange has said that it found its infrastructure to be faultily designed.

BusinessLine reported on March 18 that NSE’s data centre located at Kohinoor City in Kurla, which is a few kilometres away from its main BKC data centre, was under the scanner for the tech glitch. The Kohinoor City data centre came up around 2010, when NSE was starting co-location trading, and servers related to clearing and index calculations were shifted there to accommodate speed traders at the BKC site. The hardware failure that NSE has now revealed is also linked to issues at the Kohinoor City site, sources said.

“On February 24, post link failure, we saw unexpected behaviour of the SAN system, with the primary SAN becoming inaccessible to the host servers. This resulted in the risk management system of NSE Clearing and other systems such as clearing and settlement, index and surveillance systems becoming unavailable,” the NSE said.

‘Should have been seamless’

The NSE also says the SAN is a fault-tolerant system that was designed to function ‘seamlessly even in the event of telecom link failures between primary and near disaster recovery (NDR at Kohinoor City) site.’

“One of the features of SAN that was deployed in October 2020 was designed to provide not just zero data loss but also zero down time. Before deployment, the system was tested against various scenarios including link failures and functioned properly. However, on February 24, post link failure, the SAN system at the primary data centre stopped functioning, which was completely unexpected,” the NSE said.

Digging roads & interruption

The NSE has blamed road digging and construction activity for the failure of its network connectivity links, which subsequently affected the hardware.

“Between our primary (BKC) and NDR (Kohinoor) sites, NSE has multiple telecom links with two service providers to ensure redundancy. On February 24, we had instability due to digging and construction activity along the path between the two sites. The replication to NDR is designed such that in the event of the links between primary and NDR getting cut, the primary continues operations without any direct effect. Post earlier link failures, operations continued without any interruption,” the NSE said.

The failure of the NSE’s primary site and the one at Kohinoor also prevented the exchange from starting trading from its emergency data centre in Chennai. SEBI has now asked the NSE to fix individual responsibility for the disruptions.

Published on March 22, 2021 16:44