Checking Telegram Open Network with PVS-Studio

Picture 3


Telegram Open Network (TON) is a platform by the same team that developed the Telegram messenger. In addition to the blockchain, TON provides a large set of services. The developers recently made the platform’s code, which is written in C++, publicly available and uploaded it to GitHub. We decided to check the project before its official release.

Introduction


Telegram Open Network is a set of various services. Among other things, it provides a payment system of its own based on the Gram cryptocurrency, and a virtual machine called TON VM, which executes smart contracts. It also offers a messaging service, TON Messages. The project as a whole is seen as a countermeasure to Internet censorship.

The project is built with CMake, so I didn’t have any difficulties building and checking it. The source code is written in C++14 and runs to 210 thousand LOC:

Picture 5


Since the project is a small and high-quality one, there aren’t many bugs in it, but they still should be dealt with.

Return code

static int process_workchain_shard_hashes(....) {
  ....
  if (f == 1) {
    if ((shard.shard & 1) || cs.size_ext() != 0x20000) {
      return false;                                     // <=
    }
    ....
    int r = process_workchain_shard_hashes(....);
    if (r < 0) {
      return r;
    }
    ....
    return cb.store_bool_bool(true) && cb.store_ref_bool(std::move(left)) && 
            cb.store_ref_bool(std::move(right)) &&
            cb.finalize_to(branch)
               ? r
               : -1;
  ....
}


PVS-Studio diagnostic message: V601 The 'false' value is implicitly cast to the integer type.

It looks like the function returns the wrong type of error status here. The function should apparently return a negative value for failure rather than true/false. That’s at least what it does further in the code, where it returns -1.

Comparing a variable with itself


class LastBlock : public td::actor::Actor {
  ....
  ton::ZeroStateIdExt zero_state_id_;
  ....
};

void LastBlock::update_zero_state(ton::ZeroStateIdExt zero_state_id) {
  ....
  if (zero_state_id_ == zero_state_id_) {
    return;
  }

  LOG(FATAL) << ....;
}


PVS-Studio diagnostic message: V501 There are identical sub-expressions to the left and to the right of the '==' operator: zero_state_id_ == zero_state_id_

TON follows a coding standard that prescribes that class members' names should end in an underscore. In cases like this, however, this notation may lead to a bug as you risk overlooking the underscore. The name of the argument passed to this function is similar to that of the class member, which makes it easy to mix them up. It is this argument that was most likely meant to participate in the comparison.

Unsafe macro

namespace td {
namespace detail {

[[noreturn]] void process_check_error(const char *message, const char *file,
                                      int line);

}  // namespace detail
}

#define CHECK(condition)                                               \
  if (!(condition)) {                                                  \
    ::td::detail::process_check_error(#condition, __FILE__, __LINE__); \
  }

void BlockDb::get_block_handle(BlockIdExt id, ....) {
  if (!id.is_valid()) {
    promise.set_error(....);
    return;
  }
  CHECK(id.is_valid()); // <=
  ....
}


PVS-Studio diagnostic message: V581 The conditional expressions of the 'if' statements situated alongside each other are identical. Check lines: 80, 84.

The condition inside the CHECK macro will never execute as it has been already checked by the previous if statement.

There’s also another error present here: the CHECK macro is unsafe since the condition inside it is not wrapped in a do {… } while (0) construct. Such wrapping is needed to avoid collisions with other conditions in the else branch. In other words, the following code wouldn’t work as expected:

if (X)
  CHECK(condition)
else
  foo();


Checking a signed variable

class Slice {
  ....
  char operator[](size_t i) const;
  ....
};

td::Result CellSerializationInfo::get_bits(td::Slice cell) const {
  ....
  int last = cell[data_offset + data_len - 1];
  if (!last || last == 0x80) { // <=
    return td::Status::Error("overlong encoding");
  }
  ....
}


PVS-Studio diagnostic message: V560 A part of conditional expression is always false: last == 0×80.

The second part of the condition will never execute because the type char is signed in this case. When assigning a value to a variable of type int, sign extension will occur, so its values will still lie within the range [-128, 127], not [0, 256].

It should be noted that char is not always signed: its behavior is platform- and compiler-dependent. So in theory, the condition in question could still execute when building on a different platform.

Bitwise-shifting a negative number

template 
bool AnyIntView::export_bits_any(....) const {
  ....
  int mask = (-0x100 >> offs) & 0xff;
  ....
}


PVS-Studio diagnostic message: V610 Unspecified behavior. Check the shift operator '>>'. The left operand '-0×100' is negative.

Executing a bitwise right shift operation on a negative number is unspecified behavior: it’s impossible to know in advance if the sign will be extended or padded with zeroes.

Null check after new

CellBuilder* CellBuilder::make_copy() const {
  CellBuilder* c = new CellBuilder();
  if (!c) { // <=
    throw CellWriteError();
  }
  ....
}


PVS-Studio diagnostic message: V668 There is no sense in testing the 'c' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error.

The message says it all: if memory allocation fails, the program will throw an exception rather than return a null pointer. It means the check is pointless.

Redundant check

int main(int argc, char* const argv[]) {
  ....
  if (!no_env) {
    const char* path = std::getenv("FIFTPATH");
    if (path) {
      parse_include_path_set(path ? path : "/usr/lib/fift",
                             source_include_path);
    }
  }
  ....
}


PVS-Studio diagnostic message: V547 Expression 'path' is always true.

This snippet is taken from one of the project’s internal utilities. The ternary operator is redundant in this case: the condition it checks is already checked by the previous if statement. It looks like the developers forgot to remove this ternary operator when they decided to discard the use of standard paths (there’s at least no mention of those in the help message).

Unused variable

bool Op::set_var_info_except(const VarDescrList& new_var_info,
                        const std::vector& var_list) {
  if (!var_list.size()) {
    return set_var_info(new_var_info);
  }
  VarDescrList tmp_info{new_var_info};
  tmp_info -= var_list;
  return set_var_info(new_var_info);     // <=
}


PVS-Studio diagnostic message: V1001 The 'tmp_info' variable is assigned but is not used by the end of the function.

The developers were apparently going to use a variable named tmp_info in the last line of this function. Here’s the code of that same function but with other parameter specifiers:

bool Op::set_var_info_except(VarDescrList&& new_var_info,
                        const std::vector& var_list) {
  if (var_list.size()) {
    new_var_info -= var_list; // <=
  }
  return set_var_info(std::move(new_var_info));
} 


Greater or less than?

int compute_compare(const VarDescr& x, const VarDescr& y, int mode) {
  switch (mode) {
    case 1:  // >
      return x.always_greater(y) ? 1 : (x.always_leq(y) ? 2 : 3);
    case 2:  // =
      return x.always_equal(y) ? 1 : (x.always_neq(y) ? 2 : 3);
    case 3:  // >=
      return x.always_geq(y) ? 1 : (x.always_less(y) ? 2 : 3);
    case 4:  // <
      return x.always_less(y) ? 1 : (x.always_geq(y) ? 2 : 3);
    case 5:  // <>
      return x.always_neq(y) ? 1 : (x.always_equal(y) ? 2 : 3);
    case 6:  // >=
      return x.always_geq(y) ? 1 : (x.always_less(y) ? 2 : 3);
    case 7:  // <=>
      return x.always_less(y)
                 ? 1
                 : (x.always_equal(y)
                        ? 2
                        : (x.always_greater(y)
                               ? 4
                               : (x.always_leq(y)
                                      ? 3
                                      : (x.always_geq(y)
                                            ? 6
                                            : (x.always_neq(y) ? 5 : 7)))));
    default:
      return 7;
  }
}


PVS-Studio diagnostic message: V1037 Two or more case-branches perform the same actions. Check lines: 639, 645

If you read carefully, you noticed that this code lacks a <= operation. Indeed, it is this operation that case 6 should be dealing with. We can deduce that by looking at two spots. The first is the initialization code:

AsmOp compile_cmp_int(std::vector& res, std::vector& args,
                      int mode) {
  ....
  if (x.is_int_const() && y.is_int_const()) {
    r.set_const(compute_compare(x.int_const, y.int_const, mode));
    x.unused();
    y.unused();
    return push_const(r.int_const);
  }
  int v = compute_compare(x, y, mode);
  ....
}

void define_builtins() {
  ....
  define_builtin_func("_==_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 2));
  define_builtin_func("_!=_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 5));
  define_builtin_func("_<_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 4));
  define_builtin_func("_>_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 1));
  define_builtin_func("_<=_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 6));
  define_builtin_func("_>=_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 3));
  define_builtin_func("_<=>_", arith_bin_op,
                      std::bind(compile_cmp_int, _1, _2, 7));
  ....
} 


The define_builtins function, as you can see, contains a call compile_cmp_int for the <= operator with the mode parameter set to 6.

The second spot is the compile_cmp_int function itself, which lists the names of operations:

AsmOp compile_cmp_int(std::vector& res, std::vector& args,
                      int mode) {
  ....
  static const char* cmp_names[] = {"", "GREATER", "EQUAL", "GEQ", "LESS",
                                    "NEQ", "LEQ", "CMP"};
  ....
  return exec_op(cmp_names[mode], 2);
}


Index 6 corresponds to the LEQ word, which means «Less or Equal».

It’s another nice bug of the class of bugs found in comparison functions.

Miscellaneous

#define VM_LOG_IMPL(st, mask)                                       \
  LOG_IMPL_FULL(get_log_interface(st), ...., VERBOSITY_NAME(DEBUG), \
                (get_log_mask(st) & mask) != 0, "") // <=


PVS-Studio diagnostic message: V1003 The macro 'VM_LOG_IMPL' is a dangerous expression. The parameter 'mask' must be surrounded by parentheses.

The VM_LOG_IMPL macro is unsafe. Its second parameter is not enclosed in parentheses, which could potentially cause undesirable side effects if a complex expression is passed to the condition. But if mask is just a constant, this code will run with no problems at all. That said, nothing prevents you from passing anything else to the macro.

Conclusion


TON turned out to be pretty small, so there are few bugs to find there, which the Telegram developer team should certainly be given credit for. But everyone makes mistakes every now and then, even these guys. Code analyzers are powerful tools capable of detecting dangerous spots in source code at the early development stages even in the most quality code bases, so don’t neglect them. Static analysis is not meant to be run from time to time but should be part of the development process: «Introduce Static Analysis in the Process, Don’t Just Search for Bugs with It».

© Habrahabr.ru