PVS-Studio for Java hits the road. Next stop is Elasticsearch
The PVS-Studio team has been keeping the blog about the checks of open-source projects by the same-name static code analyzer for many years. To date, more than 300 projects have been checked, the base of errors contains more than 12000 cases. Initially the analyzer was implemented for checking C and C++ code, support of C# was added later. Therefore, from all checked projects the majority (> 80%) accounts for C and C++. Quite recently Java was added to the list of supported languages, which means that there is now a whole new open world for PVS-Studio, so it’s time to complement the base with errors from Java projects.
The Java world is vast and varied, so one doesn’t even know where to look first when choosing a project to test the new analyzer. Ultimately, the choice fell on the full-text search and analytical engine Elasticsearch. It is quite a successful project, and it’s even especially pleasant to find errors in significant projects. So, what defects did PVS-Studio for Java manage to detect? Further talk will be right about the results of the check.
Briefly about Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine with open source code, capable of solving a growing number of use cases. It enables you to store large amounts of data, carry out a quick search and analytics (almost in real time mode). Typically, it is used as the underlying mechanism/technology, which provides applications with complex functions and search requirements.
Among the major sites using Elasticsearch there are Wikimedia, StumbleUpon, Quora, Foursquare, SoundCloud, GitHub, Netflix, Amazon, IBM, Qbox.
Fine, enough of introduction.
The whole story of how things were
There were no problems with the check itself. The sequence of actions is rather simple and didn’t take much time:
- Downloaded Elasticsearch from GitHub;
- Followed the instructions how to run the Java analyzer and ran the analysis;
- Received the analyzer’s report, delved into it and pointed out interesting cases.
Now let’s move on to the main point.
Watch out! Possible NullPointerException
V6008 Null dereference of 'line'. GoogleCloudStorageFixture.java (451)
private static PathTrie defaultHandlers(....) {
....
handlers.insert("POST /batch/storage/v1", (request) -> {
....
// Reads the body
line = reader.readLine();
byte[] batchedBody = new byte[0];
if ((line != null) ||
(line.startsWith("--" + boundary) == false)) // <=
{
batchedBody = line.getBytes(StandardCharsets.UTF_8);
}
....
});
....
}
The error in this code fragment is that if the string from the buffer wasn’t read, the call of the startsWith method in the condition of the if statement will result in throwing the NullPointerException exception. Most likely, this is a typo and when writing a condition developers meant the && operator instead of ||.
V6008 Potential null dereference of 'followIndexMetadata'. TransportResumeFollowAction.java (171), TransportResumeFollowAction.java (170), TransportResumeFollowAction.java (194)
void start(
ResumeFollowAction.Request request,
String clusterNameAlias,
IndexMetaData leaderIndexMetadata,
IndexMetaData followIndexMetadata,
....) throws IOException
{
MapperService mapperService = followIndexMetadata != null // <=
? ....
: null;
validate(request,
leaderIndexMetadata,
followIndexMetadata, // <=
leaderIndexHistoryUUIDs,
mapperService);
....
}
Another warning from the V6008 diagnostic. The object followIndexMetadata kindled my interest. The start method accepts several arguments as input, our suspect is right among them. After that, based on checking our object for null, a new object is created, which is involved in further method logic. Check for null shows us that followIndexMetadata can still come from the outside as a null object. Well, let’s look further.
Then multiple arguments are pushed to the validate method (again, there is our considered object among them). If we look at the implementation of the validation method, it all falls into place. Our potential null object is passed to the validate method as a third argument, where it unconditionally gets dereferenced. Potential NullPointerException as a result.
static void validate(
final ResumeFollowAction.Request request,
final IndexMetaData leaderIndex,
final IndexMetaData followIndex, // <=
....)
{
....
Map ccrIndexMetadata = followIndex.getCustomData(....); // <=
if (ccrIndexMetadata == null) {
throw new IllegalArgumentException(....);
}
....
}}
We don’t know for sure with what arguments the start method is called. It is quite possible that all arguments are checked somewhere before calling the method and no null object dereference will happen. Anyway, we should admit that such code implementation still looks unreliable and deserves attention.
V6060 The 'node' reference was utilized before it was verified against null. RestTasksAction.java (152), RestTasksAction.java (151)
private void buildRow(Table table, boolean fullId,
boolean detailed, DiscoveryNodes discoveryNodes,
TaskInfo taskInfo) {
....
DiscoveryNode node = discoveryNodes.get(nodeId);
....
// Node information. Note that the node may be null because it has
// left the cluster between when we got this response and now.
table.addCell(fullId ? nodeId : Strings.substring(nodeId, 0, 4));
table.addCell(node == null ? "-" : node.getHostAddress());
table.addCell(node.getAddress().address().getPort());
table.addCell(node == null ? "-" : node.getName());
table.addCell(node == null ? "-" : node.getVersion().toString());
....
}
Another diagnostic rule with the same problem triggered here. NullPointerException. The rule cries out for developers: «Guys, what are you doing? How could you do that? Oh, it is awful! Why do you first use the object and check if for null in the next line? Here is how null object dereference happens. Alas, even a developer’s comment didn’t help.
V6060 The 'cause' reference was utilized before it was verified against null. StartupException.java (76), StartupException.java (73)
private void printStackTrace(Consumer consumer) {
Throwable originalCause = getCause();
Throwable cause = originalCause;
if (cause instanceof CreationException) {
cause = getFirstGuiceCause((CreationException)cause);
}
String message = cause.toString(); // <=
consumer.accept(message);
if (cause != null) { // <=
// walk to the root cause
while (cause.getCause() != null) {
cause = cause.getCause();
}
....
}
....
}
In this case we should take into account that the getCause method of the Throwable class might return null. The above problem repeats further, so its explanation is needless.
Meaningless conditions
V6007 Expression 's.charAt (i) != '\t'' is always true. Cron.java (1223)
private static int findNextWhiteSpace(int i, String s) {
for (; i < s.length() && (s.charAt(i) != ' ' || s.charAt(i) != '\t'); i++)
{
// intentionally empty
}
return i;
}
The considered function returns the index of the first space character, starting from the i index. What’s wrong? We have the analyzer warning that s.charAt (i) != '\t' is always true, which means the expression (s.charAt (i) != ' ' || s.charAt (i) != '\t') will always be true as well. Is this true? I think, you can easily make sure of this, by substituting any character.
As a result, this method will always return the index, equal to s.length (), which is wrong. I would venture to suggest that the above method is to blame here:
private static int skipWhiteSpace(int i, String s) {
for (; i < s.length() && (s.charAt(i) == ' ' || s.charAt(i) == '\t'); i++)
{
// intentionally empty
}
return i;
}
Developers implemented the method, then copied and got our erroneous method findNextWhiteSpace, having made some edits. They kept fixing and fixing the method but haven’t fixed it up. To get this right, one should use the && operator instead of ||.
V6007 Expression 'remaining == 0' is always false. PemUtils.java (439)
private static byte[]
generateOpenSslKey(char[] password, byte[] salt, int keyLength)
{
....
int copied = 0;
int remaining;
while (copied < keyLength) {
remaining = keyLength - copied;
....
copied += bytesToCopy;
if (remaining == 0) { // <=
break;
}
....
}
....
}
From the condition of the copied < keyLength loop we can note, that copied will always be less than keyLength. Hence, it is pointless to compare the remaining variable with 0, and it will always be false, at which point the loop won’t exit by a condition. Remove the code or reconsider the behavior logic? I think, only developers will be able to put all the dots over the i.
V6007 Expression 'healthCheckDn.indexOf ('=') > 0' is always false. ActiveDirectorySessionFactory.java (73)
ActiveDirectorySessionFactory(RealmConfig config,
SSLService sslService,
ThreadPool threadPool)
throws LDAPException
{
super(....,
() -> {
if (....) {
final String healthCheckDn = ....;
if (healthCheckDn.isEmpty() &&
healthCheckDn.indexOf('=') > 0)
{
return healthCheckDn;
}
}
return ....;
},
....);
....
}
Meaningless expression again. According to the condition, the healthCheckDn string has to be both empty and contain the character '=' not in the first position, so that lambda expression would return the string variable healthCheckDn. Phew, that’s all then! As you probably understood, it is impossible. We’re not going to go deep in the code, let’s leave it to developers' discretion.
I cited only some of the erroneous examples, but beyond that there was plenty of the V6007 diagnostic triggerings, which should be regarded one by one, making relevant conclusions.
Little method can go a long way
private static byte char64(char x) {
if ((int)x < 0 || (int)x > index_64.length)
return -1;
return index_64[(int)x];
}
So here we have a teeny-tiny method of several lines. But bugs are on the watch! Analysis of this method gave the following result:
- V6007 Expression '(int)x < 0' is always false. BCrypt.java(429)
- V6025 Possibly index '(int) x' is out of bounds. BCrypt.java (431)
Issue N1. The expression (int)x < 0 is always false (Yes, V6007 again). The x variable cannot be negative, as it is of the char type. The char type is an unsigned integer. It cannot be called a real error, but, nonetheless, the check is redundant and can be removed.
Issue N2. Possible array index out of bounds, resulting in the ArrayIndexOutOfBoundsException exception. Then it begs the question, which lies on the surface: «Wait, how about the index check?»
So, we have a fixed-size array of 128 elements:
private static final byte index_64[] = {
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, 0, 1, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, -1, -1,
-1, -1, -1, -1, -1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
-1, -1, -1, -1, -1, -1, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, -1, -1, -1, -1, -1
};
When the char64 method receives the x variable, the index validity gets checked. Where is the flaw? Why is array index out of bounds still possible?
The check (int)x > index_64.length is not quite correct. If the char64 method receives x with the value 128, the check won’t protect against ArrayIndexOutOfBoundsException. Maybe this never happens in reality. However, the check is written incorrectly, and one has to change«greater than» operator (>) with «greater than or equal to (> =).
Comparisons, which did their best
V6013 Numbers 'displaySize' and 'that.displaySize' are compared by reference. Possibly an equality comparison was intended. ColumnInfo.java (122)
....
private final String table;
private final String name;
private final String esType;
private final Integer displaySize;
....
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
ColumnInfo that = (ColumnInfo) o;
return displaySize == that.displaySize && // <=
Objects.equals(table, that.table) &&
Objects.equals(name, that.name) &&
Objects.equals(esType, that.esType);
}
What is incorrect here is that displaySize objects of the Integer type are compared using the == operator, that is, by reference. It’s quite possible that ColumnInfo objects, whose displaySize fields have different references but the same content, will be compared. In this case, comparison will give us the negative result, when we expected to get true.
I would venture to guess that such a comparison could be the result of a failed refactoring and initially the displaySize field was of the int type.
V6058 The 'equals' function compares objects of incompatible types: Integer, TimeValue. DatafeedUpdate.java (375)
....
private final TimeValue queryDelay;
private final TimeValue frequency;
....
private final Integer scrollSize;
....
boolean isNoop(DatafeedConfig datafeed)
{
return (frequency == null
|| Objects.equals(frequency, datafeed.getFrequency()))
&& (queryDelay == null
|| Objects.equals(queryDelay, datafeed.getQueryDelay()))
&& (scrollSize == null
|| Objects.equals(scrollSize, datafeed.getQueryDelay())) // <=
&& ....)
}
Incorrect objects comparison again. This time objects with incompatible types are compared (Integer and TimeValue). The result of this comparison is obvious, and it is always false. You can see that class fields are compared with each other, only the names of the fields have to be changed. Here is the point — a developer decided to speed up the process by using the copy-paste and got a bug into the bargain. The class implements a getter for the scrollSize field, so to fix the error one should use the method datafeed .getScrollSize ().
Let’s look at a couple of erroneous examples without any explanation. Problems are quite obvious.
V6001 There are identical sub-expressions 'tookInMillis' to the left and to the right of the '==' operator. TermVectorsResponse.java (152)
@Override
public boolean equals(Object obj) {
....
return index.equals(other.index)
&& type.equals(other.type)
&& Objects.equals(id, other.id)
&& docVersion == other.docVersion
&& found == other.found
&& tookInMillis == tookInMillis // <=
&& Objects.equals(termVectorList, other.termVectorList);
}
V6009 Function 'equals' receives an odd argument. An object 'shardId.getIndexName ()' is used as an argument to its own method. SnapshotShardFailure.java (208)
@Override
public boolean equals(Object o) {
....
return shardId.id() == that.shardId.id() &&
shardId.getIndexName().equals(shardId.getIndexName()) && // <=
Objects.equals(reason, that.reason) &&
Objects.equals(nodeId, that.nodeId) &&
status.getStatus() == that.status.getStatus();
}
Miscellaneous
V6006 The object was created but it is not being used. The 'throw' keyword could be missing. JdbcConnection.java (88)
@Override
public void setAutoCommit(boolean autoCommit) throws SQLException {
checkOpen();
if (!autoCommit) {
new SQLFeatureNotSupportedException(....);
}
}
The bug is obvious and requires no explanation. A developer created an exception, but didn’t throw it anywhere else. Such an anonymous exception is created successfully, as well as, most importantly, will be removed seamlessly. The reason is the lack of the throw operator.
V6003 The use of 'if (A) {…} else if (A) {…}' pattern was detected. There is a probability of logical error presence. MockScriptEngine.java (94), MockScriptEngine.java (105)
@Override
public T compile(....) {
....
if (context.instanceClazz.equals(FieldScript.class)) {
....
} else if (context.instanceClazz.equals(FieldScript.class)) {
....
} else if(context.instanceClazz.equals(TermsSetQueryScript.class)) {
....
} else if (context.instanceClazz.equals(NumberSortScript.class))
....
}
In multiple if- else one of the conditions is repeated twice, so the situation requires competent code review.
V6039 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. SearchAfterBuilder.java (94), SearchAfterBuilder.java (93)
public SearchAfterBuilder setSortValues(Object[] values) {
....
for (int i = 0; i < values.length; i++) {
if (values[i] == null) continue;
if (values[i] instanceof String) continue;
if (values[i] instanceof Text) continue;
if (values[i] instanceof Long) continue;
if (values[i] instanceof Integer) continue;
if (values[i] instanceof Short) continue;
if (values[i] instanceof Byte) continue;
if (values[i] instanceof Double) continue;
if (values[i] instanceof Float) continue;
if (values[i] instanceof Boolean) continue; // <=
if (values[i] instanceof Boolean) continue; // <=
throw new IllegalArgumentException(....);
}
....
}
The same condition is used twice in a row. Is the second condition superfluous or should another type be used instead of Boolean?
V6009 Function 'substring' receives an odd arguments. The 'queryStringIndex + 1' argument should not be greater than 'queryStringLength'. LoggingAuditTrail.java (660)
LogEntryBuilder withRestUriAndMethod(RestRequest request) {
final int queryStringIndex = request.uri().indexOf('?');
int queryStringLength = request.uri().indexOf('#');
if (queryStringLength < 0) {
queryStringLength = request.uri().length();
}
if (queryStringIndex < 0) {
logEntry.with(....);
} else {
logEntry.with(....);
}
if (queryStringIndex > -1) {
logEntry.with(....,
request.uri().substring(queryStringIndex + 1,// <=
queryStringLength)); // <=
}
....
}
Let’s consider here the erroneous scenario which may cause the exception StringIndexOutOfBoundsException. The exception will occur when request.uri () returns a string which contains the character '#' before '?'. There are no checks in the method, so in case if it happens, the trouble is brewing. Perhaps, this will never happen due to various checks of the object outside the method, but setting hopes on this is not the best idea.
Conclusion
For many years PVS-Studio has been helping to find defects in code of commercial and free open-source projects. Just recently Java has joined the list of supported languages for analysis. Elasticsearch became one of the first tests for our newcomer. We hope that this check will be useful for the project and interesting for readers.
PVS-Studio for Java needs new challenges, new users, active feedback and clients in order to quickly adapt to the new world:). So I invite you to download and try our analyzer on your work project right away!