August 1, 2012 was a nightmare for Knight Capital. A seemingly simple, hard-to-detect human error wiped out 17 years of work. A software fault cost the company $440 million in direct trading losses within about an hour. As a result, Knight Capital was acquired by its competitor Getco LLC the following summer.
Knight Capital Group, founded in 1995, was a well-known securities firm on Wall Street. At its peak, it accounted for 17.3% of trading volume on the NYSE and 16.9% on NASDAQ.
In addition to its regular brokerage business, the company provided customers with a trading platform for high-frequency trading. Driven by embedded quantitative models, this electronic platform not only gave customers decision support based on market data and related information, but also let them place orders automatically and at high speed through the exchange interface (in China, the automatic order placement interface has been closed).
This highly automated quantitative trading system was the foundation on which Knight Capital served its customers efficiently and with high quality. Put simply, the system automatically buys and sells on behalf of securities investors according to preset trading rules. It is also highly intelligent: a large order is split into smaller orders so that the price of an individual stock does not swing too much.
First, a term: the "dark pool". Suppose I hold 20 million shares of stock A and want to sell them on a given day. If I sell directly at the market price, the price will certainly fall under the weight of so many shares. Concealing the trade therefore matters more than executing it quickly. Ordinary investors using standard trading software would see the huge order on the book, and the final execution price would certainly be lower than my asking price. To reduce this negative effect, I give my broker a "dark sale block" instruction, which tells the broker to route the order to many hidden counterparties for execution. This "dark" strategy gives brokers a crucial advantage in anonymity and market impact, similar to an undisplayed block trade on the exchange floor. It helps keep the stock price stable and promotes liquidity.
On June 10, 2011, the NYSE required brokers to support the Retail Liquidity Program (RLP), that is, the dark pool function.
In early June 2012, the NYSE received approval from the US Securities and Exchange Commission to offer the RLP function and announced that it would launch on August 1, 2012.
That left brokers only a little over 30 days to develop, debug and go live.
Most of Knight Capital's clients were securities brokers, and many financial services giants worked with it. The market demanded that it not walk away from this business.
Knight Capital's software team had just one month to get from development through testing to launch, and they worked flat out. The core trading module they had to modify was SMARS (Smart Market Access Routing System).
SMARS executed thousands of orders per second and could compare prices across dozens of exchanges within milliseconds. It received orders from upstream users, split them and sent them to the exchanges for matching.
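The splitting step can be illustrated with a minimal sketch. It is not Knight Capital's actual algorithm: the function name `split_order` and the fixed maximum child size are assumptions made only for illustration, and a real router would also choose venues and prices.

```python
def split_order(total_qty: int, max_child_qty: int) -> list[int]:
    """Split a parent order quantity into child order quantities.

    Hypothetical helper: every child is at most max_child_qty shares and
    the children always add up to the parent quantity.
    """
    if total_qty <= 0 or max_child_qty <= 0:
        raise ValueError("quantities must be positive")
    full_children, remainder = divmod(total_qty, max_child_qty)
    children = [max_child_qty] * full_children
    if remainder:
        children.append(remainder)  # last, smaller child order
    return children


children = split_order(10_500, 1_000)
print(len(children), sum(children))  # 11 10500
```

The important invariant is that the child quantities sum exactly to the parent quantity, so the router never sends more than the customer asked for.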
However, an earlier upgrade of the trading system had left some refactored code and test code in place. One such order algorithm, named "Power Peg", was a test program an engineer had written at the time; the instructions it executed amounted to a test strategy of buying high and selling low. Leaving "dead code" like this in place is very common in large systems.
Whether through a documentation error or an engineering error, the flag used to activate the new RLP code in this upgrade was the same flag bit that had once switched on the Power Peg algorithm. Once the system was upgraded, setting that flag would activate the buy-high, sell-low algorithm on any server still running the old code.
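The danger of reusing a flag can be shown with a small sketch. Everything here is hypothetical (the field name `flag_x`, the two handler functions, the return values); it only shows how the same order is interpreted differently by an old build and a new build.

```python
def handle_order_old_build(order: dict) -> str:
    # Old build (assumed): the flag still means "run the retired Power Peg test logic".
    if order.get("flag_x"):
        return "power_peg"
    return "normal"


def handle_order_new_build(order: dict) -> str:
    # New build (assumed): the same flag now means "this is an RLP order".
    if order.get("flag_x"):
        return "rlp"
    return "normal"


rlp_order = {"symbol": "XYZ", "qty": 100, "flag_x": True}
print(handle_order_new_build(rlp_order))  # rlp
print(handle_order_old_build(rlp_order))  # power_peg
```

A fresh, never-used flag for the new feature would have made the old build treat such an order as a normal order instead of waking up dead code.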
As August 1 approached, engineers manually deployed the new RLP code to the eight SMARS servers one week before launch. Here an engineer made a fatal mistake: the new code was never copied to one of the servers. The upgrade process had no review step, no automated alerting and no regression testing. The release went live in a hurry.
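A simple automated check after deployment could have flagged the server that was missed. The sketch below is an illustration under stated assumptions, not a description of Knight Capital's tooling: the server names and build contents are invented, and the deployed code is represented as in-memory bytes rather than being fetched from real machines.

```python
import hashlib


def fingerprint(code: bytes) -> str:
    """Return a SHA-256 fingerprint of a deployed build."""
    return hashlib.sha256(code).hexdigest()


def find_stale_servers(deployed: dict[str, bytes], expected: bytes) -> list[str]:
    """List the servers whose deployed build differs from the expected one."""
    want = fingerprint(expected)
    return sorted(name for name, code in deployed.items()
                  if fingerprint(code) != want)


# Hypothetical fleet: seven servers upgraded, one forgotten.
new_build = b"smars-with-rlp"
old_build = b"smars-with-power-peg"
deployed = {f"server{i}": new_build for i in range(1, 8)}
deployed["server8"] = old_build

print(find_stale_servers(deployed, new_build))  # ['server8']
```

A release gate that refuses to go live unless this list is empty turns a silent manual omission into a visible failure.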
At 8:00 a.m. EST on August 1, an internal system named BNET automatically generated 97 alert emails and sent them to Knight Capital's engineers. These machine-generated emails did not catch the staff's attention, and Knight Capital missed its last chance to fix the system.
At 9:30 a.m. the New York Stock Exchange opened, the trading system began receiving RLP orders from brokers, and SMARS distributed the incoming orders across its servers. The seven servers running the new RLP code handled their orders correctly. On the eighth server, however, the reused flag activated the defective Power Peg code. That server kept sending child orders for every parent order it received, regardless of how many fill confirmations Knight had already received from the trading venues.
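The behavior described above amounts to a loop that never compares cumulative fills with the size of the parent order. A minimal sketch of the intended behavior follows; the function `route_parent_order`, the `execute` callback and the `max_children` safety cap are all assumptions for illustration, not SMARS code.

```python
from typing import Callable


def route_parent_order(parent_qty: int, child_qty: int,
                       execute: Callable[[int], int],
                       max_children: int = 1_000) -> tuple[int, int]:
    """Send child orders until the parent is filled.

    `execute` is a hypothetical callback that sends one child order of the
    given size and returns the quantity actually filled.
    Returns (child orders sent, shares filled).
    """
    filled = 0
    sent = 0
    # Stop as soon as cumulative fills reach the parent quantity,
    # or when the safety cap on child orders is hit.
    while filled < parent_qty and sent < max_children:
        size = min(child_qty, parent_qty - filled)
        executed = execute(size)
        sent += 1
        filled += min(executed, size)  # never count more than was sent
    return sent, filled


sent, filled = route_parent_order(1_000, 300, execute=lambda size: size)
print(sent, filled)  # 4 1000
```

Removing the `filled < parent_qty` check reproduces the failure mode in miniature: the loop keeps sending child orders no matter how much has already been executed. The `max_children` cap is a second line of defense in case fills are never reported.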
The disastrous trading began. For the 212 incoming parent orders processed by the defective Power Peg code, SMARS sent thousands of child orders per second, buying high and selling low. In about 45 minutes this produced more than 4 million executions in 154 stocks, totaling more than 397 million shares. In 75 of those stocks, Knight Capital's executions accounted for more than 20% of trading volume and moved the price by more than 5%; in 37 of them, the price moved by more than 10% and Knight Capital accounted for more than 50% of trading volume.
At 9:34, a computer analyst at the New York Stock Exchange noticed that market volume was twice the normal level, traced the surge back to Knight Capital, and immediately informed its chief information officer.
Knight Capital quickly called in its top IT staff, who needed 20 minutes to find the cause of the problem.
At 9:50, the NYSE's circuit breakers were triggered and trading in multiple stocks was automatically halted.
At 9:58, Knight Capital's engineers identified the root cause and shut down SMARS on all servers. But the damage was done. Knight had executed more than 4 million trades in 154 stocks, totaling more than 397 million shares; it was left with a net long position of about $3.5 billion in 80 stocks and a net short position of about $3.15 billion in 74 stocks.
The incident hit Knight Capital hard: its share price fell from $10.33 to $2.58 in August. In addition, the heavyweight customers TD Ameritrade, Vanguard and Wells Fargo Fund all announced that they would stop sending orders to Knight Capital. After-the-fact figures show that Knight Capital bought about $7 billion of stock during that 1 hour of trading. Under securities settlement rules, Knight Capital had to pay for that $7 billion within three days, which it could not possibly afford.
Knight Capital asked for these trades to be cancelled, but the chair of the US Securities and Exchange Commission allowed the trades in only six stocks to be cancelled under the rules and refused the rest.
If Knight Capital had dumped its shares at lower prices the next day, the market might have been thrown into turmoil again. To stabilize the market, Goldman Sachs agreed to buy, at a discount, the positions that the buggy software had accumulated at Knight Capital, a deal that cost Knight Capital $440 million.
A week later, Knight Capital received $400 million in capital support. The following summer, it was acquired by a competitor.
Many years later, in an interview, the former CEO of Knight Capital still maintained that Knight Capital was not a technology company but a brokerage company that used technology. Clearly, he regarded technology as a supporting function rather than the company's core competitive strength.
Strategies for reducing catastrophic failures in tightly coupled, multi-system complex systems:
1. Awareness
Catastrophic failures do not come from outside; they come from internal technical faults or from the wrong way we combine our systems.
2. Tools and methods
If Knight Capital had strictly followed modern software development and operations practices, the incident might never have happened: version control, unit tests, code review, automated testing, automated deployment, a distributed deployment process, risk management, and so on.
3. Time management
The schedule was another reason Knight Capital failed to deliver its RLP solution. IT project managers and CIOs should push back on overly aggressive delivery plans and present their business leaders with phased alternatives. Spending only 30 days to implement, test and deploy a major change to an algorithmic trading system that makes markets worth billions of dollars every day is impulsive, naive and reckless.
4. Encourage dissenting opinions
Warnings were issued but ignored, so the final hour in which the error could have been found was wasted. Companies must set up effective reward mechanisms and encourage people to voice dissenting opinions effectively.
One year later, on October 16, 2013, the US Securities and Exchange Commission fined Knight Capital $12 million over the erroneous trading of August 1.