dApps and Security

A major concern about blockchains are their security level.
The type, parameters, implementation and other factors of the consensus algorithm determine the weak and strong points of the chain. But most often people overlook or underestimate an other threat: The security issues that arise from bugs in the code of smart contracts are much more often and more serious.

In fact, until today, there are not many known successful attacks on blockchains, with the exception of one or two reversion of blocks in chains which run under proof-of-work consensus and had low hash power.

By “successful” here we mean attacks which lead to theft (or at least destruction) of assets. For security holes that arise from smart contracts we can not mention them here, as they are endless and each month many more cases are revealed or exploited.

So we found the people that are guilty: the smart contract developers! But before we convict them it would be wise to see why this is happening. Why so many mistakes and bugs. This is the topic of the article.

Is programming hard? Let’s see the optimistic opinion first. Hilbert, not referring to programming but to mathematics and logic in general stated that “…there is no unsolvable problem”. Entering the programming’s realm, we can see books titled like “Learn the X language in 24 hours (or days or put anything else less than a month). This is marketing of course, but I guess that they know what they are doing: many people are optimistic in nature and there are sayings like “Nothing is impossible”, “Your only limit is yourself” and many more.

We know that things are not so simple. The above common sayings may apply to everyday life in some degree, but we are talking about programming here. Peter Norvig , a director of research at Google, countering the “learn programming in very short time” logic, wrote a webpage with the title “Teach Yourself Programming in Ten Years”, showing how difficult programming is. The fact is that there are many comics, jokes, quotes and (real) horror stories of how difficult programming is.

But this is not the whole story. Let’s take a breath because reality is far worse. Hilbert , after stating in a public speech that “…there is no unsolvable problem”, got refuted the next day by Gödel. He revealed the first of his incompleteness theorems. More precisely, he proved that any computable axiomatic system that is powerful enough to describe the arithmetic of the natural numbers cannot be complete if it is consistent and vice versa. And secondly, the consistency of axioms cannot be proved within their own system. Consistency here means that axioms cannot lead to contradiction, which makes the system completely useless. Completeness means that for every proposition inside the system, we can prove (or disprove) that it is true. More generally and in other words: for any useful system (consistent) there are always propositions that they are true, but there is no way to prove it using the existing axioms. If you extend the system, adding more axioms, then you may prove these propositions. But this extension will create another propositions that are true but cannot be proven; ad infinitum. So, the system of axioms always stays incomplete. An example of this is in Euclidean geometry: you can square a cycle or trisect an angle, but outside of the rules. You cannot do it by Euclidean rules, that is, using only a straightedge and a compass. Another example is, probably, Goldbach’s_conjecture. Notice the word “probably”. We cannot be sure that the proposition “Every even integer greater than 2 can be expressed as the sum of two primes” holds, as there is no mathematical proof for it. Maybe this proposition can be proven; or maybe dis proven; or maybe is true but can’t be proven (incompleteness theorem). If it is last case, then we will never know that the proposition is true. Maybe it is or maybe it is not but no one has found the proof yet.

The incompleteness theorems have tremendous impact on computer science. John von Neumann , the “father” of today’s computers, commented about this: “Kurt Gödel’s achievement in modern logic is singular and monumental – indeed it is more than a monument, it is a landmark which will remain visible far in space and time”.

But why we should be interesting in the above? Well, six years later (1936) Alan Turing gave an answer to the halting problem. Before we give the answer of Turing and explain the halting problem we must refer to Turing machines. A Turing machine is a (imaginary) machine (of course) which manipulates symbols on a strip of tape according to some predefined rules. A Turing machine can simulate any computer program. Now, let’s set a question: Is there an algorithm than can determine if a computer program halts (or does not halt) for an arbitrary input? That is, the input is not known in advance. Someone who can provide such an algorithm solves the halting problem. Alan Turing proved that such an algorithm cannot exist; that is, there is no algorithm to decide for sure if a computer program halts for any input or runs forever (infinite looping). The halting problem is undecidable.

We can go farther and talk about Rice’s theorem, Entscheidungs problem (Church’s theorem), Kolmogorov complexity (algorithmic entropy), Post’s theorem, etc., but we are going to be carried away. All the above have been referred to see what practical consequences they bring to programmers and security. If a programmer is coding in a Turing-complete language he can never be sure that for any input (which possibly depends on the user who will run the code or call the smart contract) the program halts.

What is a Turing complete-language?

Any language that can do a conditional or unconditional jump before or after the memory place of the instruction is a “Turing-complete language”.

Programmers think it like “up-down” as they read or write the code vertically. A Turing machine property is described as “left-right”, as the tape position can jump left or right. But is exactly the same thing: it is non-unidirectional. All well-known and popular languages are Turing-complete.

So, it would be wise to separate idealism and real world. Let’s as answer to the question “Is programming hard?” confidently with “yes”, because there is mathematical proof about the answer.

Solidity, the widest used language for writing smart contracts is Turing-complete. Consequently, it is encumbered with the halting problem’s undecidability. For example, a function can be “trapped” in an infinite loop consuming all available gas (that’s why there is a gas limit, to protect from such cases). Solidity has many other pitfalls: integer overflow, integer underflow, call stack exploits, reentrancy, float precision problems, wrong visibility for variables and many more. Even worse, when inline assembly for EVM is used then anything can happen.

I hope that the reader convinced that the verdict “not-guilty!” for the developers of smart contracts is fair. The problems is that the real issue is not about the trial of the programmers, but how we can improve the security of dApps. In other words, what a blockchain can do to reduce the probability of an insecure smart contract to be deployed (because if it is deployed then there is no way back).

What if we restrict the language to “jump” unidirectional? That is, only moving “right” (or “down”). In this occasion, the language becomes non-Turing complete. Because the machine has a limited amount of memory, in this mode any program is guaranteed to halt. Bitcoin script is non-Turing complete. There is a reason for that, Satoshi was obsessed with security and he knew that a non-Turing-complete script gives better security.

The next question is if it’s possible to modify Solidity language to be non-Turing-complete – we care about Solidity language as ECOchain uses it for smart contracts. The answer to this question is “yes”. And the more obvious question is, if it is affordable why no one did it and why we still use solidity? Can non-Turing-complete languages erase the plague of security problems for programmers?

Unfortunately, things are not so simple. Using non-Turing-complete languages is much harder than Turing-complete. But most important of all, non-Turing-complete languages can’t do anything. Turing-complete languages can. And we must not forget that many security problems do not derive from the halting problem. So, Turing-complete programming cannot provide a universal solution.

In our opinion, non-Turing-complete mode can increase security. The developers should have the option to decide if they want Turing or non-Turing-complete mode. Most things (but not all) can be done in non-Turing-complete mode. Self-restriction has many benefits in security, as it helps avoiding programming bugs. For things that are very hard or impossible to be done in non-Turing-complete mode the programmer can use the Turing-complete option.

A carefully designed decentralized application can contain smart contracts in Turing and non-Turing-complete mode.

Our plan is to develop a subset of solidity, which can be activated at will when the smart contract starts. Actually, we are already working on that. Also, we must counter the security problems that do not arise from the halting problem. So, we design a subset of solidity, namely eco-solidity, who has restrictions such as:

No recursiveness
Bound looping (looping is possible but there is a maximum number of iterations)
No option to use assembly
Call graph analysis for the compiler
Remove the ability for the array length to be modified
Adding more warnings related to security at compiling process
Various other modifications that are very technical to be described here

We also take another measure to protect smart contract coding: encouraging good practices by standardization. We believe that standardization of smart contract programming is low today. While we cannot enforce the developers, we can help them use safe practices. So we plan to inject standard code inside the genesis block, which will be able to be used (for example safe math library, interface for tokens, etc.) in smart contracts. Developers can be sure that the standardized code is safe.

But how can they be sure?

Because the code will have formal verification that contains safety proof. The concept of formal verification is very large and is going to be explained in another article, but allow me to state this: formal verification is the hardest problem in computer science. Unfortunately, verifying a finite state machine (FSM) language is PSPACE-hard. For short programs, we can have (not always) formal verification. We are working on it. Delivering safe libraries who have formal verification is an ambitious goal, but it can be done. That way, we can be assured that the code is safe.

That is enough for an article. The takeaway is that programming is hard because of the nature of our universe. There will always be unsafe code. But there are ways we can follow to reduce bugs, raising the level of security for the dApps.